From: Joshua K. <mob...@gm...> - 2019-01-09 02:08:44
Hello,

I'm the author of psims, and unfortunately it doesn't do any reading itself, only writing. Internally, it uses Pyteomics (https://pypi.org/project/pyteomics/) to read mzML files, and Pyteomics does decode the array data for you. Pyteomics is great in that it gets the data into memory in a structure you can see through and then gets out of the way. There's also pymzML (https://pypi.org/project/pymzML/), which provides several additional tools for manipulating the spectrum afterwards. And of course, pyOpenMS.

Since we're assuming Python is the target language: Hannes provided great self-contained examples, but here's a function that lets you indicate whether your data were compressed and unpacks the data into a numpy array:

```python
import base64
import zlib

import numpy as np


def decode_array(bytestring, compressed=False, dtype=np.float32):
    # make sure we're dealing with bytes
    try:
        decoded_string = bytestring.encode("ascii")
    except AttributeError:
        decoded_string = bytestring
    # base64.decodestring was removed in Python 3.9; b64decode replaces it
    decoded_string = base64.b64decode(decoded_string)
    if compressed:
        decoded_string = zlib.decompress(decoded_string)
    # np.fromstring is deprecated for binary input; np.frombuffer returns a
    # read-only view of the bytes, so copy it to get a writable array
    array = np.frombuffer(decoded_string, dtype=dtype).copy()
    return array
```

On Tue, Jan 8, 2019 at 8:10 PM Eric Deutsch <ede...@sy...> wrote:

> Hi Da Qi, there are many libraries out there for many languages that read
> mzML, so you may not need to re-implement it yourself. If Python is a
> language of choice, you might look at this recent paper:
>
> http://www.mcponline.org/content/early/2018/12/18/mcp.RP118.001070?papetoc=
>
> *From:* Shofstahl, Jim <jim...@th...>
> *Sent:* Tuesday, January 8, 2019 6:50 AM
> *To:* Mass spectrometry standard development <psi...@li...>
> *Subject:* Re: [Psidev-ms-dev] mzML base64binary conversion
>
> The spectral data (and chromatogram data) is stored as a Base64 encoded
> string. Most of the programming languages will contain methods/functions
> to convert to and from the Base64 format.
> In C# those methods are part of the Convert class.
>
> https://docs.microsoft.com/en-us/dotnet/api/system.convert?view=netframework-4.7.2
>
> The FromBase64String method takes a string value and converts it to a
> byte[] array which can then be converted to another data type. We tend to
> convert it using a MemoryStream object but there are other ways to do the
> conversion:
>
>     protected static float[] ConvertToFloat(byte[] array, int points)
>     {
>         var arrayValues = new float[points];
>
>         using (var memStream = new MemoryStream(array))
>         {
>             using (var binReader = new BinaryReader(memStream))
>             {
>                 for (var i = 0; i < points; i++)
>                 {
>                     arrayValues[i] = binReader.ReadSingle();
>                 }
>             }
>         }
>
>         // Return the value(s)
>         return arrayValues;
>     }
>
> *From:* 戚达(Da Qi) <qi...@ge...>
> *Sent:* Monday, January 7, 2019 5:43 PM
> *To:* psi...@li...
> *Subject:* [Psidev-ms-dev] mzML base64binary conversion
>
> Hi,
>
> Does anyone know how to decode the binary string to real data in mzML?
>
> <binaryDataArrayList count="2">
>   <binaryDataArray encodedLength="268">
>     <cvParam cvRef="MS" accession="MS:1000514" name="m/z array" unitCvRef="MS" unitAccession="MS:1000040" unitName="m/z"/>
>     <cvParam cvRef="MS" accession="MS:1000576" name="no compression"/>
>     <cvParam cvRef="MS" accession="MS:1000521" name="32-bit float"/>
>     <binary>O6+NQ43Hp0N978lDce3XQ6Zr5kORjf9DdUMLRCfJDEThihJEqO4SRGiZGERKzBpEKwcbRKjOKURoIS1EFJ4wRI1XMUQpvDdEIeA5RJjmQkQr70NERGNGRMeDR0RMh0hE2Y5JRG/KTEQncU5EHXJPRMupUETZ9lNEQiBURLS4VETjxVVEbTdYRMk2WkQ34VpEBhFbRN10W0Q5/FtExxteRAqvX0QSY2BEJZ5gRJ5HYUQ/jWFEEutkRGCNZURcx2VErg9vROF6fEQ=</binary>
>   </binaryDataArray>
>   <binaryDataArray encodedLength="268">
>     <cvParam cvRef="MS" accession="MS:1000515" name="intensity array" unitCvRef="MS" unitAccession="MS:1000131" unitName="number of counts"/>
>     <cvParam cvRef="MS" accession="MS:1000576" name="no compression"/>
>     <cvParam cvRef="MS" accession="MS:1000521" name="32-bit float"/>
>     <binary>AEB6RADQakUAwOxEAGwGRgDgE0UAcDxFAMDIRABASEQA4N5EAMBWRAAAJ0UAIAZFADA+RQBgu0QAAB5EAMB+RQDAhUUAwLREANCpRQBwA0UAABFEAEAsRADgJkUAMAlFAOAbRQDAlEQAAKZEALBBRQAgpEQA4F1FAMDBRADAi0QAIDZFAIBFRQAAEEUAoHRFAKCMRACAL0QAINtEAGAKRQDAO0QAQONFAHAIRQD4gUUAgONDAGDaRADQLkUAAGJEAABNRAAA0EQ=</binary>
>   </binaryDataArray>
> </binaryDataArrayList>
>
> Best,
> Da Qi
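
Addendum: the cvParam entries in the quoted XML are what tell a reader how to decode each array. Here is a rough, dependency-free sketch of that idea using only the standard library (struct instead of numpy). The names decode_binary_data_array and local_name are made up for illustration, and the sample values are invented rather than taken from the thread. The accessions for "32-bit float" and "no compression" appear in the XML above; MS:1000523 (64-bit float) and MS:1000574 (zlib compression) are taken from the PSI-MS controlled vocabulary as I understand it. The sketch also assumes little-endian byte order, which is what mzML uses for binary arrays.

```python
import base64
import struct
import xml.etree.ElementTree as ET
import zlib

# PSI-MS CV accessions mapped to struct format characters.
# MS:1000521 appears in the thread; MS:1000523 is assumed from the CV.
FLOAT_FORMATS = {"MS:1000521": "f", "MS:1000523": "d"}  # 32-bit / 64-bit float
ZLIB_COMPRESSION = "MS:1000574"  # assumed from the CV, not shown in the thread


def local_name(tag):
    """Strip an XML namespace such as '{http://psi.hupo.org/ms/mzml}' from a tag."""
    return tag.rsplit("}", 1)[-1]


def decode_binary_data_array(element):
    """Decode one <binaryDataArray> element into a tuple of floats (hypothetical helper)."""
    fmt = None
    compressed = False
    text = ""
    for child in element:
        name = local_name(child.tag)
        if name == "cvParam":
            accession = child.get("accession")
            if accession in FLOAT_FORMATS:
                fmt = FLOAT_FORMATS[accession]
            elif accession == ZLIB_COMPRESSION:
                compressed = True
        elif name == "binary":
            text = child.text or ""
    if fmt is None:
        raise ValueError("no supported float precision cvParam found")
    raw = base64.b64decode(text)
    if compressed:
        raw = zlib.decompress(raw)
    count = len(raw) // struct.calcsize("<" + fmt)
    # "<" forces little-endian regardless of the host machine.
    return struct.unpack("<%d%s" % (count, fmt), raw)


# Round trip with invented values that are exactly representable as 32-bit floats.
values = (283.25, 335.5, 403.75)
payload = base64.b64encode(struct.pack("<3f", *values)).decode("ascii")
xml = (
    '<binaryDataArray encodedLength="%d">'
    '<cvParam cvRef="MS" accession="MS:1000576" name="no compression"/>'
    '<cvParam cvRef="MS" accession="MS:1000521" name="32-bit float"/>'
    "<binary>%s</binary>"
    "</binaryDataArray>"
) % (len(payload), payload)
decoded = decode_binary_data_array(ET.fromstring(xml))
```

For real files, a library such as Pyteomics or pymzML is still the easier route, since it also handles indexing, other compression schemes, and the rest of the metadata.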