mass_automation package#

Subpackages#

Submodules#

mass_automation.experiment module#

class mass_automation.experiment.Experiment(path: str, n_scans=None, n_points=None, format: Optional[str] = 'mzXML', verbose: Optional[bool] = True, suppress_caching=True)#

Bases: object

A class representing an experiment, i.e. a sequence of spectra read from a file.

spectra_mass#

The list of spectra masses.

Type:

List

spectra_ints#

The list of spectra intensities.

Type:

List

path#

The path to the file.

Type:

str

format#

The format of the file (default is ‘mzXML’).

Type:

str

n_scans#

The number of scans in the experiment.

Type:

int

n_points#

The number of points in the experiment (in millions).

Type:

int

add_to_index(function: str, params: Hashable, name: str)#

Adds an item to the index.

Parameters:
  • function (str) – name of the function; currently either “get_item” or “summarize”

  • params (Hashable) – parameters, used for calling the function. Must be immutable

  • name (str) – path to the cache file

check_in_index(function: str, params: Hashable) str#

Checks whether an entry for the given function and parameters is present in the index.

Parameters:
  • function (str) – name of the function; currently either “get_item” or “summarize”

  • params (Hashable) – parameters, used for calling the function. Must be immutable

Returns:

path to the cache

Return type:

str
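The index used by `add_to_index` and `check_in_index` can be pictured as a mapping from a `(function, params)` pair to a cache file path, which is why `params` must be hashable. The following is a minimal sketch under that assumption; the `CacheIndex` class, its internals, the example path, and the choice to return `None` on a miss are all hypothetical and are not taken from the library.

```python
from typing import Dict, Hashable, Optional, Tuple


class CacheIndex:
    """Hypothetical stand-in for the experiment's cache index."""

    def __init__(self) -> None:
        # Maps (function name, call parameters) to a cache file path.
        self._entries: Dict[Tuple[str, Hashable], str] = {}

    def add_to_index(self, function: str, params: Hashable, name: str) -> None:
        self._entries[(function, params)] = name

    def check_in_index(self, function: str, params: Hashable) -> Optional[str]:
        # Assumption: a miss yields None; the real method may behave differently.
        return self._entries.get((function, params))


index = CacheIndex()
index.add_to_index("summarize", (0, 10, True), "cache/summarize_0.pkl")
hit = index.check_in_index("summarize", (0, 10, True))
miss = index.check_in_index("get_item", (3,))
```

A tuple works as `params` because it is immutable; a list in the same position would raise `TypeError` when used as part of a dictionary key.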

find_name()#

Looks for an available name.

get_chromatogram()#

Returns the array of total intensities of the spectra.

This makes it possible to estimate the approximate amount of sample in a specific spectrum.

Returns:

The array of total intensities of the spectra.

Return type:

np.ndarray
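As an illustration of what "total intensities of the spectra" means, the sketch below computes one total per scan with NumPy. It does not call the library; the `spectra_ints` data here is made up and only mirrors the attribute of the same name.

```python
import numpy as np

# Hypothetical per-scan intensity arrays, standing in for `spectra_ints`.
spectra_ints = [
    np.array([1.0, 2.0, 3.0]),
    np.array([10.0, 0.0, 5.0]),
    np.array([0.5, 0.5]),
]

# One total-intensity value per spectrum.
chromatogram = np.array([ints.sum() for ints in spectra_ints])

# Index of the scan with the largest total intensity.
richest_scan = int(np.argmax(chromatogram))
```

Scans with a high total are natural candidates for the summation interval passed to `summarize`.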

get_names()#

Helper function to get the parent directory of the experiment file and its filename without extension.

summarize(item1=None, item2=None, subtract: Optional[bool] = True, cache=False) Spectrum#

Sums the spectra of the experiment over a given interval.

The right bound of the interval is not included in the summation.

Parameters:
  • item1 (int) – The number of the first spectrum in summation.

  • item2 (int) – The number of the last spectrum in summation + 1.

  • subtract (bool) – If True, the mean intensities of the spectra that follow the summation interval are subtracted from the sum.

  • cache (bool) – If True, the resulting spectrum is saved; otherwise it is not.

Returns:

The resulting summed spectrum.

Return type:

Spectrum

Raises:

ValueError – If item1 or item2 exceeds the number of spectra in the experiment or item1 is greater than item2.
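The half-open interval and the background subtraction can be sketched with plain NumPy. This is not the library's implementation: `summarize_sketch` is a hypothetical helper, it assumes all spectra share one mass axis (so intensities form a 2-D array), and it assumes the background is the plain mean of every spectrum after the interval.

```python
import numpy as np


def summarize_sketch(ints, item1, item2, subtract=True):
    """Sum rows item1..item2-1 of a 2-D intensity array (hypothetical helper)."""
    n_spectra = ints.shape[0]
    if item1 > item2 or item1 > n_spectra or item2 > n_spectra:
        raise ValueError("invalid summation interval")
    # Half-open slice: row item2 itself is not included.
    total = ints[item1:item2].sum(axis=0)
    if subtract and item2 < n_spectra:
        # Assumption: the background is the plain mean of all later spectra.
        total = total - ints[item2:].mean(axis=0)
    return total


ints = np.array([
    [1.0, 2.0],
    [3.0, 4.0],
    [1.0, 1.0],
    [3.0, 1.0],
])
summed = summarize_sketch(ints, 0, 2)
raw_sum = summarize_sketch(ints, 0, 2, subtract=False)
```

Here rows 0 and 1 are summed and the mean of rows 2 and 3 is subtracted, so `summed` differs from `raw_sum` by that background estimate.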

to_chrom_align_net()#

to_sima(path, min_distance=0.01, algorithm='std', alpha=None)#

class mass_automation.experiment.Spectrum(masses: ndarray, ints: ndarray, n_scans: Optional[int] = None, n_points: Optional[int] = None, path: Optional[str] = None)#

Bases: object

A class representing a spectrum.

masses#

The array of masses.

Type:

np.ndarray

ints#

The array of intensities.

Type:

np.ndarray

n_scans#

The number of scans.

Type:

int

n_points#

The number of points (in millions).

Type:

int

deisotoped#

If True, the spectrum has been deisotoped.

Type:

bool

isotopic_distributions#

The classifier labels of the spectrum.

Type:

np.ndarray

get_slice(left_mass: float, right_mass: float) Spectrum#

Returns a subspectrum of the original spectrum within a specific mass interval `[left_mass, right_mass]`.

Parameters:
  • left_mass (float) – The left limit of the interval

  • right_mass (float) – The right limit of the interval

Returns:

Subspectrum of the original spectrum. No caching is applied.

Return type:

Spectrum
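Selecting the interval `[left_mass, right_mass]` amounts to a boolean mask over the mass array. The sketch below shows this on made-up arrays; it treats both bounds as inclusive, following the bracket notation above, which is an assumption about the actual behavior at the edges.

```python
import numpy as np

# Hypothetical peak list standing in for `masses` and `ints`.
masses = np.array([100.0, 150.0, 200.0, 250.0, 300.0])
ints = np.array([5.0, 1.0, 7.0, 2.0, 9.0])

left_mass, right_mass = 150.0, 250.0

# Keep peaks whose mass lies inside the closed interval.
mask = (masses >= left_mass) & (masses <= right_mass)
slice_masses, slice_ints = masses[mask], ints[mask]
```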

save_state()#

Saves the current state of the object to the cache pickle file specified by `self.path`.

to_msi_warp(min_distance=0.01, algorithm='std', alpha=None)#

vectorize(min_mass: Optional[int] = 150, max_mass: Optional[int] = 1000, delta_mass: Optional[int] = 1, method: Optional[Callable] = <function amax>, keep_state: Optional[bool] = True, n_bins: Optional[int] = None, normalize: Optional[float] = None) ndarray#

Performs vectorization of the spectrum, where vector components are encoded by one of the available methods.

Parameters:
  • min_mass (int) – The left margin of the interval where the vectorization is performed (default is 150).

  • max_mass (int) – The right margin of the interval where the vectorization is performed (default is 1000).

  • delta_mass (int) – The length of the mass interval represented by one vector component (default is 1).

  • method (Callable) –

    The method of vectorization (default is np.max). For instance, the following NumPy functions may be used:

    • np.max - by maximal intensity in each vector component interval,

    • np.sum - by total intensity,

    • np.mean - by mean intensity.

  • keep_state (bool) – Defines whether results will be cached or not.

  • n_bins (int) – Number of bins. If None, the number of bins is calculated from the delta_mass parameter.

  • normalize (float) – If None, the maximum value is used for normalization. If -1, the spectrum is not normalized. Any other number is used as the normalization constant.

Returns:

The vector whose components characterize the spectrum intervals.

Return type:

np.ndarray
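The binning described above can be sketched as follows. This is an illustration, not the library's code: `vectorize_sketch` is a hypothetical helper, and it assumes that empty bins are filled with zero, that peaks outside `[min_mass, max_mass)` are dropped, and that the bin count is `(max_mass - min_mass) / delta_mass`.

```python
import numpy as np


def vectorize_sketch(masses, ints, min_mass=150, max_mass=1000, delta_mass=1,
                     method=np.max, normalize=None):
    """Bin a peak list into a fixed-length vector (hypothetical helper)."""
    n_bins = int((max_mass - min_mass) / delta_mass)
    vector = np.zeros(n_bins)
    # Assign each peak to a bin; peaks outside the mass range are dropped.
    bin_ids = np.floor((masses - min_mass) / delta_mass).astype(int)
    inside = (bin_ids >= 0) & (bin_ids < n_bins)
    for b in np.unique(bin_ids[inside]):
        # Reduce the intensities that fall into bin b with the chosen method.
        vector[b] = method(ints[inside & (bin_ids == b)])
    if normalize is None:
        peak = vector.max()
        if peak > 0:
            vector = vector / peak
    elif normalize != -1:
        vector = vector / normalize
    return vector


masses = np.array([150.2, 150.7, 152.5, 999.0])
ints = np.array([2.0, 4.0, 1.0, 8.0])

by_max = vectorize_sketch(masses, ints, max_mass=155)
by_sum = vectorize_sketch(masses, ints, max_mass=155, method=np.sum, normalize=-1)
```

With `np.max` the first bin keeps only its tallest peak before normalization, while `np.sum` with `normalize=-1` reports the raw total per bin.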

vectorize_by_convolution(min_mass: float, max_mass: float, n_bins: int, sigma: 1e-05, normalize: Optional[float] = None) np.ndarray#

Performs vectorization via convolution. Each peak is represented as a Gaussian curve.

Parameters:
  • min_mass (float) – The left margin of the interval where the vectorization is performed.

  • max_mass (float) – The right margin of the interval where the vectorization is performed.

  • n_bins (int) – The number of bins.

  • sigma (float) – The width of the Lorentzian curve.

  • normalize (float) – If None, the maximum value is used for normalization. If -1, the spectrum is not normalized. Any other number is used as the normalization constant.

Return type:

np.ndarray
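One way to picture this kind of vectorization is to place a bell-shaped curve at every peak and sample the sum on an evenly spaced mass grid. The sketch below uses a Gaussian profile, following the method summary; the helper `convolve_sketch`, the use of `np.linspace` for the grid, and the treatment of `sigma` as a standard deviation are assumptions, not the library's implementation.

```python
import numpy as np


def convolve_sketch(masses, ints, min_mass, max_mass, n_bins, sigma, normalize=None):
    """Sum one Gaussian per peak on a regular mass grid (hypothetical helper)."""
    grid = np.linspace(min_mass, max_mass, n_bins)
    # One curve per peak, evaluated on the grid: shape (n_peaks, n_bins).
    offsets = (grid[None, :] - masses[:, None]) / sigma
    curves = ints[:, None] * np.exp(-0.5 * offsets ** 2)
    vector = curves.sum(axis=0)
    if normalize is None:
        peak = vector.max()
        if peak > 0:
            vector = vector / peak
    elif normalize != -1:
        vector = vector / normalize
    return vector


masses = np.array([200.0, 300.0])
ints = np.array([1.0, 2.0])

# Grid points fall at 100, 150, ..., 400, so both peaks sit on a grid point.
vector = convolve_sketch(masses, ints, 100.0, 400.0, n_bins=7, sigma=10.0)
strongest_bin = int(np.argmax(vector))
```

Unlike hard binning, a peak close to a bin edge contributes to both neighboring components, so small mass shifts change the vector smoothly.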

mass_automation.experiment.peak_pick(mzs, hs, min_distance=0.01, algorithm='std', alpha=None, verbose=False, threshold=None)#

mass_automation.plot module#

mass_automation.uncertainty module#

class mass_automation.uncertainty.BinaryProbaModelWrapper(model)#

Bases: object

predict(*args, **kwargs)#

class mass_automation.uncertainty.EnsembleWrapper(model_type, models)#

Bases: object

Wrapper for an ensemble of models.

predict(*args, **kwargs)#

predict_all(*args, **kwargs)#

predict_w_uncertainty(*args, **kwargs)#

mass_automation.uncertainty.compute_entropy(p)#

mass_automation.uncertainty.compute_mean_entropy(ps)#

mass_automation.utils module#

class mass_automation.utils.Element#

Bases: object

Ac = 89#
Ag = 47#
Al = 13#
Am = 95#
Ar = 18#
As = 33#
At = 85#
Au = 79#
B = 5#
Ba = 56#
Be = 4#
Bh = 107#
Bi = 83#
Bk = 97#
Br = 35#
C = 6#
Ca = 20#
Cd = 48#
Ce = 58#
Cf = 98#
Cl = 17#
Cm = 96#
Cn = 112#
Co = 27#
Cr = 24#
Cs = 55#
Cu = 29#
Db = 105#
Ds = 110#
Dy = 66#
Er = 68#
Es = 99#
Eu = 63#
F = 9#
Fe = 26#
Fl = 114#
Fm = 100#
Fr = 87#
Ga = 31#
Gd = 64#
Ge = 32#
H = 1#
He = 2#
Hf = 72#
Hg = 80#
Ho = 67#
Hs = 108#
I = 53#
In = 49#
Ir = 77#
K = 19#
Kr = 36#
La = 57#
Li = 3#
Lr = 103#
Lu = 71#
Lv = 116#
Mc = 115#
Md = 101#
Mg = 12#
Mn = 25#
Mo = 42#
Mt = 109#
N = 7#
Na = 11#
Nb = 41#
Nd = 60#
Ne = 10#
Nh = 113#
Ni = 28#
No = 102#
Np = 93#
O = 8#
Og = 118#
Os = 76#
P = 15#
Pa = 91#
Pb = 82#
Pd = 46#
Pm = 61#
Po = 84#
Pr = 59#
Pt = 78#
Pu = 94#
Ra = 88#
Rb = 37#
Re = 75#
Rf = 104#
Rg = 111#
Rh = 45#
Rn = 86#
Ru = 44#
S = 16#
Sb = 51#
Sc = 21#
Se = 34#
Sg = 106#
Si = 14#
Sm = 62#
Sn = 50#
Sr = 38#
Ta = 73#
Tb = 65#
Tc = 43#
Te = 52#
Th = 90#
Ti = 22#
Tl = 81#
Tm = 69#
Ts = 117#
U = 92#
V = 23#
W = 74#
Xe = 54#
Y = 39#
Yb = 70#
Zn = 30#
Zr = 40#
n_elements = 119#

mass_automation.utils.lorentzian(x, x0, gam)#

Module contents#