From: 戚达(Da Q. <qi...@ge...> - 2019-01-08 03:17:42
|
Hi,

Does anyone know how to decode the binary string to real data in mzML?

<binaryDataArrayList count="2">
  <binaryDataArray encodedLength="268">
    <cvParam cvRef="MS" accession="MS:1000514" name="m/z array" unitCvRef="MS" unitAccession="MS:1000040" unitName="m/z"/>
    <cvParam cvRef="MS" accession="MS:1000576" name="no compression"/>
    <cvParam cvRef="MS" accession="MS:1000521" name="32-bit float"/>
    <binary>O6+NQ43Hp0N978lDce3XQ6Zr5kORjf9DdUMLRCfJDEThihJEqO4SRGiZGERKzBpEKwcbRKjOKURoIS1EFJ4wRI1XMUQpvDdEIeA5RJjmQkQr70NERGNGRMeDR0RMh0hE2Y5JRG/KTEQncU5EHXJPRMupUETZ9lNEQiBURLS4VETjxVVEbTdYRMk2WkQ34VpEBhFbRN10W0Q5/FtExxteRAqvX0QSY2BEJZ5gRJ5HYUQ/jWFEEutkRGCNZURcx2VErg9vROF6fEQ=</binary>
  </binaryDataArray>
  <binaryDataArray encodedLength="268">
    <cvParam cvRef="MS" accession="MS:1000515" name="intensity array" unitCvRef="MS" unitAccession="MS:1000131" unitName="number of counts"/>
    <cvParam cvRef="MS" accession="MS:1000576" name="no compression"/>
    <cvParam cvRef="MS" accession="MS:1000521" name="32-bit float"/>
    <binary>AEB6RADQakUAwOxEAGwGRgDgE0UAcDxFAMDIRABASEQA4N5EAMBWRAAAJ0UAIAZFADA+RQBgu0QAAB5EAMB+RQDAhUUAwLREANCpRQBwA0UAABFEAEAsRADgJkUAMAlFAOAbRQDAlEQAAKZEALBBRQAgpEQA4F1FAMDBRADAi0QAIDZFAIBFRQAAEEUAoHRFAKCMRACAL0QAINtEAGAKRQDAO0QAQONFAHAIRQD4gUUAgONDAGDaRADQLkUAAGJEAABNRAAA0EQ=</binary>
  </binaryDataArray>
</binaryDataArrayList>

Best,
Da Qi
|
From: Shofstahl, J. <jim...@th...> - 2019-01-08 15:24:16
|
The spectral data (and chromatogram data) are stored as a Base64-encoded string. Most programming languages provide functions to convert to and from the Base64 format. In C# those methods are part of the Convert class:

https://docs.microsoft.com/en-us/dotnet/api/system.convert?view=netframework-4.7.2

The FromBase64String method takes a string value and converts it to a byte[] array, which can then be converted to another data type. We tend to do that conversion with a MemoryStream object, but there are other ways to do it:

    // Requires: using System.IO;
    // points is the number of values to read (array.Length / 4 for 32-bit floats).
    protected static float[] ConvertToFloat(byte[] array, int points)
    {
        var arrayValues = new float[points];

        using (var memStream = new MemoryStream(array))
        {
            using (var binReader = new BinaryReader(memStream))
            {
                for (var i = 0; i < points; i++)
                {
                    // ReadSingle consumes 4 bytes as a little-endian 32-bit float
                    arrayValues[i] = binReader.ReadSingle();
                }
            }
        }

        // Return the value(s)
        return arrayValues;
    }
|
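For readers following along in Python, the two steps described above (Base64 text to bytes, then bytes to 32-bit floats) can be sketched with the standard library alone. This is a minimal sketch, not code from the thread: the function names `decode_floats` and `encode_floats` are made up for illustration, and it assumes uncompressed, little-endian 32-bit floats as declared by the cvParams in the question. The sample string `AEB6RA==` is the first four bytes of the intensity array from the question, re-padded so it stands alone.

```python
import base64
import struct


def decode_floats(encoded):
    """Base64 text -> raw bytes -> tuple of little-endian 32-bit floats."""
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("byte count is not a multiple of 4")
    count = len(raw) // 4  # 4 bytes per 32-bit float
    return struct.unpack("<%df" % count, raw)


def encode_floats(values):
    """Inverse of decode_floats: sequence of floats -> Base64 text."""
    raw = struct.pack("<%df" % len(values), *values)
    return base64.b64encode(raw).decode("ascii")


# First value of the intensity array in the question.
print(decode_floats("AEB6RA=="))  # (1001.0,)
```

As a sanity check on sizes: an encodedLength of 268 characters ending in a single `=` corresponds to 200 bytes, which is 50 values at 4 bytes each.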
From: Eric D. <ede...@sy...> - 2019-01-09 01:10:00
|
Hi Da Qi,

There are many libraries for many languages that read mzML, so you may not need to re-implement it yourself. If Python is your language of choice, you might look at this recent paper:

http://www.mcponline.org/content/early/2018/12/18/mcp.RP118.001070?papetoc=
|
From: Joshua K. <mob...@gm...> - 2019-01-09 02:08:44
|
Hello,

I'm the author of psims, and unfortunately it doesn't actually do any reading itself, just writing. Internally, it uses Pyteomics (https://pypi.org/project/pyteomics/) to read mzML files, which does decode the array data for you. Pyteomics is great in that it gets the data into memory in a structure you can see through and then gets out of the way. There's also pymzML (https://pypi.org/project/pymzML/), which provides several additional tools to then manipulate the spectrum. And of course, pyOpenMS.

While we're assuming Python is the target language, Hannes provided great self-contained examples, but here's a function that lets you indicate that your data were compressed, and unpacks the data into a numpy array:

```python
import base64
import zlib

import numpy as np


def decode_array(bytestring, compressed=False, dtype=np.float32):
    # make sure we're dealing with bytes
    try:
        decoded_string = bytestring.encode("ascii")
    except AttributeError:
        decoded_string = bytestring
    # base64 text -> raw bytes
    decoded_string = base64.b64decode(decoded_string)
    if compressed:
        decoded_string = zlib.decompress(decoded_string)
    # frombuffer returns a read-only view, so copy to get a writable array
    array = np.frombuffer(decoded_string, dtype=dtype).copy()
    return array
```
|
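Whether to decompress and which precision to use can be read from the cvParam accessions on each binaryDataArray rather than passed in by hand. The sketch below is an illustration under stated assumptions, not part of any library mentioned in this thread: the helper name `decode_binary_data_array` is made up, it uses only the standard library, and it handles only the accessions for 32-bit float (MS:1000521), 64-bit float (MS:1000523) and zlib compression (MS:1000574). Tags are matched by local name so that it also works when the element carries the mzML namespace.

```python
import base64
import struct
import xml.etree.ElementTree as ET
import zlib

# PSI-MS accession -> struct format character
PRECISION = {
    "MS:1000521": "f",  # 32-bit float
    "MS:1000523": "d",  # 64-bit float
}
ZLIB = "MS:1000574"     # zlib compression


def _local(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def decode_binary_data_array(element):
    """Decode one <binaryDataArray> element according to its cvParams."""
    accessions = set()
    text = ""
    for child in element.iter():
        name = _local(child.tag)
        if name == "cvParam":
            accessions.add(child.get("accession"))
        elif name == "binary":
            text = child.text or ""
    fmt = next((PRECISION[a] for a in accessions if a in PRECISION), None)
    if fmt is None:
        raise ValueError("no supported precision cvParam found")
    raw = base64.b64decode(text)
    if ZLIB in accessions:
        raw = zlib.decompress(raw)
    count = len(raw) // struct.calcsize("<" + fmt)
    return struct.unpack("<%d%s" % (count, fmt), raw)


# Cut-down element in the shape of the one in the question (one value only).
snippet = """
<binaryDataArray encodedLength="8">
  <cvParam cvRef="MS" accession="MS:1000515" name="intensity array"/>
  <cvParam cvRef="MS" accession="MS:1000576" name="no compression"/>
  <cvParam cvRef="MS" accession="MS:1000521" name="32-bit float"/>
  <binary>AEB6RA==</binary>
</binaryDataArray>
"""
print(decode_binary_data_array(ET.fromstring(snippet)))  # (1001.0,)
```

For whole files, one of the reader libraries named above is likely the better choice; this only shows what they do for a single array.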
From: 戚达(Da Q. <qi...@ge...> - 2019-01-09 02:21:12
|
Hi Jim, Eric, Hannes,

Thanks for all your replies. Python is exactly what I am looking for.

Best,
Da
|
From: Hannes R. <han...@gm...> - 2019-01-09 01:49:03
|
Hi Da Qi,

As others pointed out, there are multiple libraries that do this. However, a complete (and simple) solution in Python 2 would look like this:

    import struct
    coded = "O6+NQ43Hp0N978lDce3XQ6Zr5kORjf9DdUMLRCfJDEThihJEqO4SRGiZGERKzBpEKwcbRKjOKURoIS1EFJ4wRI1XMUQpvDdEIeA5RJjmQkQr70NERGNGRMeDR0RMh0hE2Y5JRG/KTEQncU5EHXJPRMupUETZ9lNEQiBURLS4VETjxVVEbTdYRMk2WkQ34VpEBhFbRN10W0Q5/FtExxteRAqvX0QSY2BEJZ5gRJ5HYUQ/jWFEEutkRGCNZURcx2VErg9vROF6fEQ="
    # '<' = little-endian, 'f' = 32-bit float, so 4 bytes per value
    struct.unpack('<%sf' % (len(coded.decode('base64')) // 4), coded.decode('base64'))

or in Python 3:

    import struct, base64
    coded = b"O6+NQ43Hp0N978lDce3XQ6Zr5kORjf9DdUMLRCfJDEThihJEqO4SRGiZGERKzBpEKwcbRKjOKURoIS1EFJ4wRI1XMUQpvDdEIeA5RJjmQkQr70NERGNGRMeDR0RMh0hE2Y5JRG/KTEQncU5EHXJPRMupUETZ9lNEQiBURLS4VETjxVVEbTdYRMk2WkQ34VpEBhFbRN10W0Q5/FtExxteRAqvX0QSY2BEJZ5gRJ5HYUQ/jWFEEutkRGCNZURcx2VErg9vROF6fEQ="
    # '<' = little-endian, 'f' = 32-bit float, so 4 bytes per value
    struct.unpack('<%sf' % (len(base64.decodebytes(coded)) // 4), base64.decodebytes(coded))

Best regards,
Hannes
|
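Once both arrays are decoded, the m/z and intensity values line up index by index, so they can be zipped into peaks. The following is only a sketch: `decode_f32`, `encode_f32` and `pair_peaks` are hypothetical helper names, the data are assumed to be uncompressed little-endian 32-bit floats as in the question, and the sample values are made up and encoded on the spot rather than taken from the thread.

```python
import base64
import struct


def decode_f32(encoded):
    """Base64 text -> tuple of little-endian 32-bit floats."""
    raw = base64.b64decode(encoded)
    return struct.unpack("<%df" % (len(raw) // 4), raw)


def encode_f32(values):
    """Sequence of floats -> Base64 text (used here only to build sample input)."""
    raw = struct.pack("<%df" % len(values), *values)
    return base64.b64encode(raw).decode("ascii")


def pair_peaks(mz_encoded, intensity_encoded):
    """Decode both arrays and return a list of (m/z, intensity) tuples."""
    mz = decode_f32(mz_encoded)
    intensity = decode_f32(intensity_encoded)
    if len(mz) != len(intensity):
        raise ValueError("m/z and intensity arrays differ in length")
    return list(zip(mz, intensity))


# Made-up sample values that are exactly representable as 32-bit floats.
mz_text = encode_f32([100.0, 200.5])
intensity_text = encode_f32([1001.0, 3.0])

peaks = pair_peaks(mz_text, intensity_text)
print(peaks)                             # [(100.0, 1001.0), (200.5, 3.0)]
print(max(peaks, key=lambda p: p[1]))    # (100.0, 1001.0)
```

The length check is there because a mismatch usually means one of the arrays was decoded with the wrong precision or without decompressing it first.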