Introduction
In this paper, we explore the effect of earthquakes, which can occur anywhere in the world, on geomagnetic waves over the course of a day. To achieve this goal, all magnetic data recorded at a single station is organized according to the maximum earthquake of each day.
According to the studies, there is not a single day on which no earthquake happens anywhere in the world; indeed, hundreds of earthquakes occur continuously around the earth. Building on this fact, and using feature extraction methods such as the Mel Frequency Cepstral Coefficient (MFCC) and the Continuous Wavelet Transform (CWT), relevant discriminant features are extracted from the event signal.
By feeding these features separately to dimension reduction algorithms such as Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA), a classification system was built based on the K-Nearest Neighbor (KNN) and Support Vector Machine (SVM) methods. Classification results on real geomagnetic data indicate that the characteristic structure that large earthquakes impose on geomagnetic waves can be detected and classified with high accuracy.
Methodology
The method comprises three stages: i. feature extraction, ii. dimension reduction, and iii. classification. Each stage is explained in more detail below.
Feature Extraction Methods
1. Mel Frequency Cepstral Coefficient
Mel Frequency Cepstral Coefficient (MFCC) is a popular method for extracting features from audio signals, especially in speech and sound analysis. Here’s a short summary of the process:
- Framing: Break the audio signal into small short-time frames within which the signal is assumed to be stationary.
- Power Spectrum: Apply the Fourier Transform to each frame to get the frequency content and compute its power (periodogram).
- Mel Filterbank: Pass the power spectrum through a set of filters spaced according to the Mel scale, which mimics human perception of pitch.
- Logarithm: Take the log of the filterbank energies to compress the dynamic range and allow for normalization.
- Discrete Cosine Transform (DCT): Apply the DCT to decorrelate the log energies and compact the information.
- Select Coefficients: Keep only coefficients 2 to 13, which hold the most relevant information; discard the rest.
This method effectively converts a time-domain audio signal into a compact set of frequency-domain features.
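As an illustration, a library such as librosa implements this pipeline end to end. Below is a minimal sketch on a synthetic test tone; the signal and sampling rate are made-up stand-ins, not the geomagnetic data used later.

```python
import numpy as np
import librosa

# Synthetic 1-second test tone (hypothetical stand-in for a real signal)
sr = 22050                                   # assumed sampling rate in Hz
t = np.arange(0, 1.0, 1.0 / sr)
signal = np.sin(2 * np.pi * 440 * t).astype(np.float32)

# Framing, power spectrum, Mel filterbank, log, and DCT are handled internally;
# n_mfcc controls how many cepstral coefficients are kept per frame
mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)
print(mfcc.shape)  # (13, number_of_frames)
```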
2. Continuous Wavelet Transform
The Wavelet Transform (WT) is a powerful tool for analyzing non-stationary signals, as it captures both time and frequency characteristics. Unlike traditional Fourier analysis, which provides only frequency information, WT allows us to see how the frequency content of a signal changes over time. It does this by convolving the signal with a set of wavelet functions—short, oscillating filters—at various scales and time positions. These wavelets represent different frequency bands, and by adjusting the scale, the transform can focus on high-frequency (small scale) or low-frequency (large scale) features.
The most commonly used wavelets for spectral analysis are the Morlet or Gabor wavelets, which have a Gaussian envelope. The result of this process is a set of wavelet coefficients, which indicate how much of the wavelet is present in the signal at a given time and scale. This produces a time-scale representation that helps in detecting localized features in the signal. The Continuous Wavelet Transform (CWT), defined as an integral of the product of the signal and scaled wavelet functions, is especially useful for applications where frequency components vary over time, such as in seismic, audio, or geomagnetic signal analysis.
The CWT of a signal $x(t)$ at scale $a$ and time position $b$ is given by

$$
W(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} x(t)\, \psi^{*}\!\left(\frac{t - b}{a}\right) dt
$$

where $\psi$ is the mother wavelet and $\psi^{*}$ denotes its complex conjugate.
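A library such as PyWavelets exposes this transform directly. Here is a minimal sketch with the Morlet wavelet on a synthetic signal; the signal and scale range are illustrative assumptions only.

```python
import numpy as np
import pywt

# Synthetic signal whose frequency content changes over time (illustrative only)
t = np.linspace(0, 1, 1000)
signal = np.sin(2 * np.pi * 5 * t) + np.sin(2 * np.pi * 25 * t**2)

# Small scales capture high frequencies, large scales capture low frequencies
scales = np.arange(1, 64)
coeffs, freqs = pywt.cwt(signal, scales, "morl", sampling_period=t[1] - t[0])
print(coeffs.shape)  # (number_of_scales, number_of_samples)
```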
Dimension Reduction Methods
Here, two dimension reduction approaches, namely i. Principal Component Analysis and ii. Linear Discriminant Analysis, are explained in more detail.
1. Principal Component Analysis
PCA is a technique used to reduce the number of variables in a dataset while keeping the most important information. It works by transforming the original variables into a new set of uncorrelated variables called principal components, which are sorted by how much variance (information) they capture.
- Simplify complex datasets.
- Outliers can strongly affect PCA, so they should be removed first.
- Handle missing data in simple ways (e.g., filling with averages).
PCA uses the eigenvalues and eigenvectors of the data's covariance matrix to find the most important directions (principal components) in the data.
After PCA, you can reduce thousands of features into just a few and use them for classification or other analysis.
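As a sketch of how this step is typically done with scikit-learn; the feature matrix below is random, standing in for the extracted geomagnetic features.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical feature matrix: 500 samples (days) x 1440 features (minutes)
X = np.random.rand(500, 1440)

# Standardize first so no single feature dominates the covariance matrix
X_scaled = StandardScaler().fit_transform(X)

# Keep enough principal components to explain 95% of the variance
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X_scaled)
print(X_reduced.shape, pca.explained_variance_ratio_[:5])
```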
Classification of Magnetic Features
1. Support Vector Machine
Support Vector Machine (SVM) is a supervised machine learning algorithm that uses linear or non-linear kernels for classification or regression problems and performs effectively in high-dimensional spaces.
The goal of this algorithm is to find the optimal separating hyperplane between classes by focusing on the training cases that maximize the margin (distance) between the classes and minimize the error.
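Below is a minimal scikit-learn sketch with the three kernels compared later in the result tables; the data is random and purely illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical reduced feature matrix and binary class labels
X = np.random.rand(400, 10)
y = np.random.randint(1, 3, size=400)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# The same kernels evaluated later: 'poly', 'rbf', and 'sigmoid'
for kernel in ("poly", "rbf", "sigmoid"):
    clf = SVC(kernel=kernel)
    clf.fit(X_train, y_train)
    print(kernel, clf.score(X_test, y_test))
```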
2. K-Nearest Neighbor
KNN is a simple classification algorithm with a low error rate. The algorithm starts from a training data set with accurate class labels. Then, for a test point $q_i$ whose label is not yet defined, the distances between it and every point in the training data set are calculated. After sorting these distances, the class label of the test point $q_i$ is decided according to the labels of the k nearest points in the training data set [33]. The distance between two points can be defined in many ways; using the Euclidean distance, it is given by

$$
d(q_i, x_j) = \sqrt{\sum_{m=1}^{n} \left(q_{i,m} - x_{j,m}\right)^2}
$$

where $n$ is the number of features. The procedure is summarized in the following implementation.
```python
from collections import Counter

def knn_classify(test_point, training_data, k, distance_fn):
    """
    test_point: the unknown point to classify
    training_data: list of tuples (x_i, y_i)
    k: number of neighbors
    distance_fn: function to compute distance between two points
    """
    # Step 1: Compute distance from test_point to all training points
    distances = []
    for x_i, y_i in training_data:
        d = distance_fn(test_point, x_i)
        distances.append((d, y_i))

    # Step 2: Sort by distance (closest first)
    distances.sort(key=lambda tup: tup[0])

    # Step 3: Select the K nearest neighbors
    k_nearest = distances[:k]

    # Step 4: Count class votes and return the majority label
    labels = [label for _, label in k_nearest]
    majority_vote = Counter(labels).most_common(1)[0][0]
    return majority_vote
```
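A minimal usage sketch of the function above, using the Euclidean distance and made-up points:

```python
import math

# Hypothetical training points: (feature vector, class label)
training_data = [([1.0, 2.0], 1), ([1.5, 1.8], 1), ([5.0, 8.0], 2), ([6.0, 9.0], 2)]

# math.dist computes the Euclidean distance between two coordinate sequences
predicted = knn_classify([1.2, 1.9], training_data, k=3, distance_fn=math.dist)
print(predicted)  # the three nearest neighbors vote for class 1
```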
The choice of the parameter K (the number of neighbors) is very important and can affect the classification results. K should not be a multiple of the number of classes, and an odd value should be chosen for binary class problems. As the dimension of the data set increases, the time complexity of this algorithm increases as well.
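In practice, the equivalent classifier from scikit-learn also exposes the neighbor-search algorithms ('auto', 'ball_tree', 'kd_tree', 'brute') that appear as rows in the KNN result tables below. A small sketch on made-up data:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical feature matrix (days x features) and binary day labels
X_train = np.random.rand(100, 10)
y_train = np.random.randint(1, 3, size=100)

# Odd k for a binary problem; 'algorithm' selects the neighbor-search structure
knn = KNeighborsClassifier(n_neighbors=5, algorithm="kd_tree")
knn.fit(X_train, y_train)
print(knn.predict(np.random.rand(3, 10)))
```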
Results and Discussion
In this part, we prepare the geomagnetic data used in the implementation. The required preparation steps are described in more detail below.
Catalog Preparation
As a first step, we need to prepare the earthquake catalog containing the earthquake information. To achieve this, earthquakes of magnitude 2.5 and above recorded all over the world between 2007 and 2018 were downloaded from this website: https://earthquake.usgs.gov/earthquakes/search/. This Excel file contains all information about the selected earthquakes, including location, longitude, latitude, time, magnitude, depth, and some additional fields.
Reviewing the data, we noticed that hundreds of earthquakes happen all over the world in every hour of every day, so the data had to be organized into a comprehensible structure. For this reason, only the maximum earthquake of each day is kept, and the final catalog is built from these records. The resulting catalog therefore contains the maximum earthquake information for each day between 2007 and 2018.
| Date | Time | Latitude | Longitude | Depth (km) | Magnitude | Place |
|---|---|---|---|---|---|---|
| 2007-01-01 | 00:01:21 | 10.981 | -85.325 | 40.4 | 4.2 | Costa Rica |
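A pandas sketch of how such a daily-maximum catalog could be built from the downloaded USGS file follows; the file name and column names are assumptions based on the standard USGS CSV export, not the original preparation script.

```python
import pandas as pd

# Assumed USGS export with at least 'time' and 'mag' columns
catalog = pd.read_csv("usgs_earthquakes_2007_2018.csv", parse_dates=["time"])

# For every day, keep only the row with the largest magnitude
catalog["date"] = catalog["time"].dt.date
idx = catalog.groupby("date")["mag"].idxmax()
daily_max = catalog.loc[idx].sort_values("date")

daily_max.to_csv("daily_max_catalog.csv", index=False)
print(len(daily_max))  # roughly one row per day between 2007 and 2018
```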
In addition, we counted the number of days by the magnitude of their maximum earthquake, as shown in the table below. For example, the number of days between 2007 and 2018 whose maximum earthquake had magnitude 5.0 is 111. The total number of days, and therefore of maximum earthquakes, is 4383.
| Magnitude | Count | Magnitude | Count | Magnitude | Count | Magnitude | Count |
|---|---|---|---|---|---|---|---|
| 4.7 | 1 | 4.8 | 18 | 4.87 | 1 | 4.9 | 56 |
| 5.0 | 111 | 5.06 | 1 | 5.1 | 200 | 5.2 | 289 |
| 5.29 | 1 | 5.3 | 336 | 5.35 | 1 | 5.36 | 1 |
| 5.4 | 337 | 5.5 | 376 | 5.6 | 357 | 5.7 | 315 |
| 5.71 | 1 | 5.8 | 309 | 5.88 | 1 | 5.9 | 287 |
| 6.0 | 260 | 6.1 | 213 | 6.2 | 142 | 6.3 | 153 |
| 6.4 | 106 | 6.5 | 91 | 6.6 | 71 | 6.7 | 61 |
| 6.8 | 51 | 6.9 | 54 | 7.0 | 33 | 7.1 | 30 |
| 7.2 | 21 | 7.3 | 21 | 7.4 | 12 | 7.5 | 14 |
| 7.6 | 9 | 7.7 | 10 | 7.8 | 12 | 7.9 | 6 |
| 8.0 | 2 | 8.1 | 3 | 8.2 | 3 | 8.3 | 2 |
| 8.4 | 1 | 8.6 | 1 | 8.8 | 1 | 9.1 | 1 |
Total Earthquakes: 4383
Geomagnetic Data Preparation
The Iznik (Turkey) station was selected for downloading the geomagnetic data corresponding to the catalog. At this station, data is recorded every minute in three components: North (X), East (Y), and vertical (Z). This means each component has 1440 data points per day (60 minutes × 24 hours). Each day is then assigned to one of two classes according to the magnitude of its maximum earthquake, as summarized in the table below.
| Class | Magnitude threshold | Number of days |
|---|---|---|
| 1 | < 5.4 | 1016 |
| 2 | > 6.1 | 911 |
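A sketch of how the two classes could be assigned from the daily-maximum catalog using the thresholds above; the file and column names follow the hypothetical catalog sketch shown earlier.

```python
import pandas as pd

daily_max = pd.read_csv("daily_max_catalog.csv", parse_dates=["date"])

# Class 1: days whose maximum magnitude is below 5.4; Class 2: above 6.1;
# days between the two thresholds receive no label in this sketch
daily_max["label"] = pd.NA
daily_max.loc[daily_max["mag"] < 5.4, "label"] = 1
daily_max.loc[daily_max["mag"] > 6.1, "label"] = 2
labeled = daily_max.dropna(subset=["label"])
print(labeled["label"].value_counts())
```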
Performance Analysis of Algorithms
Here, four different scenarios are defined to evaluate the performance of the SVM and KNN classification algorithms. Feature extraction (MFCC, CWT) and dimension reduction (PCA, LDA) are applied to the geomagnetic data in different combinations. The results of each scenario are described in more detail below.
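Each case can be expressed as a small scikit-learn pipeline. For example, here is a sketch of the "after LDA (SVD)" case followed by an RBF-kernel SVM, evaluated with cross-validation on random placeholder data; the study's exact splits and parameters are not reproduced here.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical feature matrix (days x features) and binary day labels
X = np.random.rand(300, 50)
y = np.random.randint(1, 3, size=300)

# Case "after LDA (SVD)" followed by an RBF-kernel SVM
pipeline = make_pipeline(LinearDiscriminantAnalysis(solver="svd"), SVC(kernel="rbf"))
scores = cross_val_score(pipeline, X, y, cv=5)
print(scores.mean())  # mean accuracy over the folds
```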
1. First Scenario Results
In this scenario, the three axes of one-day data are concatenated before applying the dimension reduction methods. The results of the SVM and KNN classification methods for the different cases are given in the two tables below:
| Kernel | Case 1: after PCA | Case 2: after LDA (SVD) | Case 3: after LDA (Eigen) | Case 4: after PCA + LDA (SVD) | Case 5: after PCA + LDA (Eigen) |
|---|---|---|---|---|---|
| Polynomial | 47.28 | 47.28 | 98.39 | 52.68 | 47.28 |
| RBF | 52.72 | 98.46 | 64.94 | 46.43 | 51.34 |
| Sigmoid | 54.82 | 98.89 | 52.72 | 49.02 | 50.50 |

Table: SVM mean accuracy (%) in the first scenario
| Algorithm | Case 1: raw features | Case 2: after LDA (SVD) | Case 3: after LDA (Eigen) | Case 4: after PCA (SVD) | Case 5: after PCA + LDA (Eigen) |
|---|---|---|---|---|---|
| Auto | 25.64 | 98.60 | 61.14 | 52.42 | 49.20 |
| Ball Tree | 25.64 | 98.60 | 61.14 | 53.92 | 49.20 |
| Kd Tree | 26.27 | 98.70 | 60.98 | 53.45 | 49.30 |
| Brute | 25.64 | 98.60 | 61.14 | 53.92 | 49.20 |

Table: KNN accuracy (%) in the first scenario
As the SVM table shows, applying LDA with the SVD solver before SVM provides the best results, especially with the Sigmoid kernel.
According to the KNN classification results, the best accuracy is obtained when LDA with the SVD solver is applied to the raw data (case 2).
2. Second Scenario Results
In this scenario, the dimension reduction methods are applied to each axis separately, and the reduced outputs are then concatenated and used as input to the SVM and KNN classification algorithms (a sketch of this per-axis reduction is given below, followed by the result tables).
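A sketch of this per-axis idea under the same placeholder setup: the dimension reduction is fitted on each axis separately and the reduced outputs are concatenated column-wise; the axis split and sizes are assumptions for illustration.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical per-axis feature matrices for the X, Y, and Z components (days x features)
axes = [np.random.rand(300, 1440) for _ in range(3)]
y = np.random.randint(1, 3, size=300)

# Reduce each axis separately, then concatenate the reduced features column-wise
reduced = [LinearDiscriminantAnalysis(solver="svd").fit_transform(A, y) for A in axes]
X_combined = np.hstack(reduced)
print(X_combined.shape)  # (300, 3), since binary LDA yields one component per axis
```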
| Kernel | Case 1: raw features | Case 2: after LDA (SVD) | Case 3: after LDA (Eigen) | Case 4: after PCA (SVD) | Case 5: after PCA + LDA (Eigen) |
|---|---|---|---|---|---|
| Polynomial | 47.28 | 62.70 | 100.00 | 38.92 | 43.61 |
| RBF | 52.72 | 99.99 | 100.00 | 94.13 | 83.45 |
| Sigmoid | 96.25 | 99.99 | 52.72 | 95.83 | 90.37 |

Table: SVM mean accuracy (%) in the second scenario
| Algorithm | Case 1: raw features | Case 2: after LDA (SVD) | Case 3: after LDA (Eigen) | Case 4: after PCA + LDA (SVD) | Case 5: after PCA + LDA (Eigen) |
|---|---|---|---|---|---|
| Auto | 94.04 | 100.00 | 100.00 | 94.40 | 91.78 |
| Ball Tree | 94.04 | 100.00 | 100.00 | 94.40 | 91.78 |
| Kd Tree | 94.04 | 100.00 | 100.00 | 94.45 | 90.80 |
| Brute | 94.04 | 100.00 | 100.00 | 94.45 | 91.78 |

Table: KNN accuracy (%) in the second scenario
- For SVM after LDA with the Eigen solver, the Polynomial and RBF kernels reach high accuracy.
- In general, KNN shows high performance in this scenario.
3. Third Scenario Results
In this scenario, the features extracted with the MFCC approach are analyzed in various cases. The number of selected features is controlled by the n_mfcc parameter.
| Kernel | Case 0: raw features | Case 1: after PCA | Case 2: after LDA (SVD) | Case 3: after LDA (Eigen) | Case 4: after PCA + LDA (SVD) | Case 5: after PCA + LDA (Eigen) |
|---|---|---|---|---|---|---|
| Polynomial | 92.69 | 65.38 | 40.21 | 93.43 | 41.36 | 52.72 |
| RBF | 79.48 | 65.44 | 93.05 | 93.55 | 93.88 | 92.15 |
| Sigmoid | 52.72 | 67.00 | 95.69 | 52.72 | 95.64 | 92.26 |

Table: SVM mean accuracy (%) in the third scenario
| Algorithm | Case 0: raw features | Case 1: after PCA | Case 2: after LDA (SVD) | Case 3: after LDA (Eigen) | Case 4: after PCA + LDA (SVD) | Case 5: after PCA + LDA (Eigen) |
|---|---|---|---|---|---|---|
| Auto | 86.44 | 89.77 | 92.75 | 94.72 | 94.72 | 93.05 |
| Ball Tree | 86.44 | 89.77 | 92.75 | 94.72 | 94.72 | 93.05 |
| Kd Tree | 85.35 | 89.67 | 92.64 | 94.41 | 94.40 | 93.00 |
| Brute | 86.44 | 89.77 | 92.75 | 94.72 | 94.72 | 93.05 |

Table: KNN accuracy (%) in the third scenario
- According to the SVM table above, SVM reaches its highest accuracy after LDA with the SVD solver when the Sigmoid kernel is used. We also notice that after applying PCA and LDA, the SVM accuracy with the RBF and Sigmoid kernels increases.
4. Fourth Scenario Results
In this scenario, the CWT feature extraction algorithm is applied to the geomagnetic data, and the resulting features are passed through the dimension reduction methods before being used as input to the classification methods.
| Kernel | Case 0: raw features | Case 1: after PCA | Case 2: after LDA (SVD) | Case 3: after LDA (Eigen) | Case 4: after PCA + LDA (SVD) | Case 5: after PCA + LDA (Eigen) |
|---|---|---|---|---|---|---|
| Polynomial | 92.78 | 47.28 | 96.83 | 47.28 | 38.61 | 57.48 |
| RBF | 84.85 | 52.72 | 99.61 | 93.99 | 93.17 | 79.64 |
| Sigmoid | 52.72 | 96.77 | 100.00 | 52.72 | 96.07 | 85.56 |

Table: SVM mean accuracy (%) in the fourth scenario
| Algorithm | Case 0: raw features | Case 1: after PCA | Case 2: after LDA (SVD) | Case 3: after LDA (Eigen) | Case 4: after PCA + LDA (SVD) | Case 5: after PCA + LDA (Eigen) |
|---|---|---|---|---|---|---|
| Auto | 94.61 | 94.14 | 100.00 | 94.72 | 93.05 | 84.22 |
| Ball Tree | 94.61 | 94.14 | 100.00 | 94.72 | 93.05 | 84.22 |
| Kd Tree | 94.71 | 94.30 | 100.00 | 94.41 | 92.53 | 83.19 |
| Brute | 94.61 | 94.14 | 100.00 | 94.72 | 93.05 | 84.22 |

Table: KNN accuracy (%) in the fourth scenario
- The SVM algorithm after LDA with the SVD solver produces high accuracy for all kernels, as shown in the SVM table above. After applying PCA, the accuracy with the Sigmoid kernel increases.
- KNN classification achieves high accuracy with all algorithms, especially after LDA with the SVD solver.
Comparison Results of SVM and KNN Classification Methods
- The first scenario indicates that in case 2, the KNN and SVM methods behave almost the same across all kernels and algorithms, and both reach high accuracy.
- The second scenario shows that the RBF kernel achieves almost the same accuracy in cases 2, 3, 4, and 5. On the other hand, all algorithms of the KNN classification method reach their highest accuracy in cases 2 and 3.
- In the third scenario, the RBF kernel accuracy improves in cases 2, 3, 4, and 5. In contrast, the KNN approach maintains high accuracy in all cases, especially in cases 3 and 4.
- In the last scenario, the RBF kernel achieves high accuracy in cases 2, 3, 4, and 5. For KNN classification, all algorithms achieve excellent accuracy in all cases, especially in case 2.