Hendrickson-Bray motional narrowing analysis processing pipeline
This script processes data by:

1. Reading Bruker-format NMR data files
2. Fitting peaks and analyzing lineshapes
3. Calculating widths and errors normalized to the Larmor frequency
4. Plotting the calculated widths vs temperature
5. Fitting the Hendrickson-Bray model to the data
6. Exporting the fitted parameters to a CSV file
7. Plotting the data and the fitted model
Functions:

get_filepaths(directory): lists the files in a directory and sorts them by integer value
get_nmr_paths(directory, filepaths): constructs the full paths to the NMR data files
get_larmor_freq(path): extracts the Larmor frequency from the Bruker data dictionary
process_nmr_data(nmr_paths, processor): main processing pipeline that:
- loads the NMR data
- selects the region of interest (-500 to 500 ppm)
- normalizes the data
- fits the peaks with the given initial parameters. The initial parameters are crucial for the fitting process; if the spectrum contains several peaks, make sure the first set of initial parameters describes the peak belonging to the atomic site of interest.
- plots and saves the results
- calculates the widths normalized to the Larmor frequency
import os

import nmrglue as ng
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit


def get_filepaths(directory):
    """Sort directory contents numerically."""
    filepaths = os.listdir(directory)
    filepaths.sort(key=lambda x: int(x))
    return filepaths


def get_nmr_paths(directory, filepaths):
    """Construct the full paths to the processed Bruker data (pdata/1)."""
    nmr_paths = []
    for filename in filepaths:
        f = os.path.join(directory, filename)
        nmr_paths.append(os.path.join(f, 'pdata', '1'))
    return nmr_paths


def get_larmor_freq(path):
    """Extract the Larmor frequency (MHz) from the Bruker data dictionary."""
    dic, data = ng.bruker.read(path)
    udic = ng.bruker.guess_udic(dic, data)
    larmor_freq = udic[0]['obs']
    return larmor_freq
def process_nmr_data(nmr_paths, processor):
    widths = []
    width_errors = []
    for i, nmr_path in enumerate(nmr_paths):
        processor.load_data(nmr_path)
        x_region, y_region = processor.select_region(-500, 500)
        x_data, y_data = processor.normalize_data(x_region, y_region)
        initial_params = [-10, 1, 110, 0.5, 100]
        fixed_x0 = [False]
        popt, metrics, fitted = processor.fit_peaks(x_data, y_data, initial_params, fixed_x0)
        fig, axes, components = processor.plot_results(x_data, y_data, fitted, popt)
        legend = fig.gca().get_legend()
        legend.set_frame_on(False)
        fig.set_size_inches(5, 4)
        # Label the figure with the experiment folder (the folder above 'pdata');
        # split on os.sep so this works on any operating system
        path_components = os.path.normpath(nmr_path).split(os.sep)
        exp_folder = path_components[path_components.index('pdata') - 1]
        fig.suptitle(f'nmr folder: {exp_folder}', x=0.75, y=0.95)
        processor.save_results(nmr_path, x_data, y_data, fitted, metrics, popt, components)
        # Convert the fitted width (ppm) to kHz: ppm * Larmor frequency (MHz) = Hz, / 1000 = kHz
        larmor_freq = get_larmor_freq(nmr_path)
        width = metrics[0]['width'][0] * larmor_freq / 1000
        width_error = metrics[0]['width'][1] * larmor_freq / 1000
        widths.append(width)
        width_errors.append(width_error)
    return widths, width_errors
if __name__ == "__main__":
    directory = r'..\..\data\single_peak_HB_analysis\lineshape'
    processor = NMRProcessor()
    filepaths = get_filepaths(directory)
    nmr_paths = get_nmr_paths(directory, filepaths)
    widths, width_errors = process_nmr_data(nmr_paths, processor)
The x data, i.e. the temperature in Kelvin, can be created using numpy's linspace function: the first argument is the starting temperature in Celsius, the second argument is the ending temperature in Celsius, and the third argument is the number of points to generate (the length of widths/filepaths); add 273.15 to convert the result to Kelvin.

Alternatively, the temperatures in Kelvin can be pre-stored in a CSV file in which each filepath in filepaths corresponds to an NMR experiment number and its temperature in Kelvin. To do this, create a CSV file named temperature_points.csv with two columns: column 1 is the filepath and column 2 is the temperature in Kelvin. The CSV file should be stored in the same directory as the lineshape folder.
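A minimal sketch of both options. The set-point range, the dummy widths, and the CSV column names ('filepath', 'temperature_K') are assumptions for illustration; in practice the dataframe would come from pd.read_csv('.../temperature_points.csv') and the column names should match whatever headers you wrote into the file.

```python
import numpy as np
import pandas as pd

# Placeholder stand-ins for the variables produced by the pipeline above.
filepaths = ['10', '11', '12']   # experiment folder names
widths = [12.4, 10.1, 7.8]       # fitted widths in kHz (dummy values)

# Option 1: evenly spaced set points, converted from Celsius to Kelvin.
start_c, end_c = 25.0, 125.0     # assumed set-point range in Celsius
x_data = np.linspace(start_c, end_c, len(widths)) + 273.15

# Option 2: look each experiment's temperature up by filepath.
# In practice: temps = pd.read_csv('temperature_points.csv')
temps = pd.DataFrame({'filepath': [10, 11, 12],
                      'temperature_K': [298.15, 348.15, 398.15]})
temps = temps.set_index('filepath')
x_data = np.array([temps.loc[int(fp), 'temperature_K'] for fp in filepaths])
```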
def HB_equation(x_data, B, Ea, A, D):
    """Hendrickson-Bray model for the linewidth as a function of temperature."""
    k = 8.617333e-5  # Boltzmann constant in eV/K
    return (A / (1 + ((A / B) - 1) * np.exp(-Ea / (k * x_data)))) + D
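The limiting behaviour of the model is worth keeping in mind when choosing initial guesses: at low temperature exp(-Ea/kT) vanishes and the width tends to A + D (rigid-lattice limit), while at high temperature the exponential tends to 1 and the width tends to B + D (fully narrowed limit). A quick numerical check, using made-up parameters rather than fitted values:

```python
import numpy as np

def HB_equation(x_data, B, Ea, A, D):
    k = 8.617333e-5  # Boltzmann constant in eV/K
    return (A / (1 + ((A / B) - 1) * np.exp(-Ea / (k * x_data)))) + D

# Illustrative parameters (assumptions, not fitted values):
B, Ea, A, D = 1.0, 0.3, 20.0, 0.5   # kHz, eV, kHz, kHz

low_T = HB_equation(np.array([100.0]), B, Ea, A, D)[0]
high_T = HB_equation(np.array([1e6]), B, Ea, A, D)[0]
# At 100 K the exponential is negligible, so low_T is close to A + D;
# at very high T the exponential approaches 1, so high_T approaches B + D.
```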
def HB_analysis(x_data, y_data, y_error, initial_guesses, file_name):
    # initial_guesses should be written as [B_guess, Ea_guess, A_guess, D_guess].
    # If the guesses are poor the fit may misbehave; try other guess values if that happens.
    # x_data should be in Kelvin; y_data should be in kHz (the imported data).
    # y_error is currently not used by the fit; to weight the fit, pass
    # sigma=y_error, absolute_sigma=True to curve_fit.

    # Run the fit
    popt, pcov = curve_fit(HB_equation, x_data, y_data, p0=initial_guesses, maxfev=5000)
    # Store the optimised parameters
    B_opt, Ea_opt, A_opt, D_opt = popt
    # The diagonal of pcov contains the variance of each parameter;
    # its square root gives the parameter errors
    perr = np.sqrt(np.diag(pcov))
    B_err, Ea_err, A_err, D_err = perr

    # Determine the quality of the fit: evaluate the HB model at the data points
    # using the optimised parameters. (Evaluating on a dense
    # x_fit = np.linspace(np.min(x_data), np.max(x_data), 1000) instead
    # would affect the determination of R².)
    y_fit = HB_equation(x_data, B_opt, Ea_opt, A_opt, D_opt)
    squaredDiffs = np.square(y_data - y_fit)
    squaredDiffsFromMean = np.square(y_data - np.mean(y_data))
    rsquared = 1 - np.sum(squaredDiffs) / np.sum(squaredDiffsFromMean)
    print(f"R² = {rsquared}")

    # Inspect the fitted equation
    print(f"w = [{A_opt} / 1 + ({A_opt}/{B_opt} - 1) * e^(-{Ea_opt}/k*T)] + {D_opt}")
    # Print the optimised parameters and their errors
    print('B:', f'{B_opt} +/- {B_err}')
    print('Ea:', f'{Ea_opt} +/- {Ea_err}')
    print('A:', f'{A_opt} +/- {A_err}')
    print('D:', f'{D_opt} +/- {D_err}')
    params_data = pd.DataFrame({'Parameters': ['B', 'Ea', 'A', 'D'],
                                'Values': [B_opt, Ea_opt, A_opt, D_opt],
                                'Errors': [B_err, Ea_err, A_err, D_err]})
    params_data.to_csv(file_name + 'HB_parameters.csv')
    return y_fit, Ea_opt, Ea_err, B_opt, A_opt, D_opt
If the visualisation of the fit and the optimised parameters is satisfactory, you can save the lineshape_data dataframe to a CSV file, in case you would like to visualise the data in your preferred software.
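A minimal sketch of assembling and saving such a dataframe. The column names, the output filename, and the dummy values are assumptions for illustration; substitute the x_data, widths, width_errors, and y_fit arrays produced by the pipeline above.

```python
import numpy as np
import pandas as pd

# Stand-ins for the arrays produced above (dummy values for illustration).
x_data = np.array([300.0, 350.0, 400.0])     # temperature / K
widths = np.array([12.4, 8.2, 3.1])          # measured widths / kHz
width_errors = np.array([0.3, 0.2, 0.1])     # width errors / kHz
y_fit = np.array([12.1, 8.5, 3.0])           # HB model evaluated at x_data

# Collect everything in one dataframe and write it out for external plotting.
lineshape_data = pd.DataFrame({'temperature_K': x_data,
                               'width_kHz': widths,
                               'width_error_kHz': width_errors,
                               'HB_fit_kHz': y_fit})
lineshape_data.to_csv('lineshape_data.csv', index=False)
```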