From: Stefan S. <ssc...@ss...> - 2006-08-06 17:35:41
Hello,

I'm trying to extract the authors from a BibTeX file with a Python
script and pybliographer 1.2.9. Currently, I use this script for
testing:

-----------------------------------------------------------------
#! /usr/bin/env python

import os
import sys

# important though it _seems_ not to be used (does some configuration
# for module `Open`)
import pybrc
from Pyblio import Open

if len(sys.argv) != 2:
    print "Usage: %s bibtex_file" % os.path.basename(sys.argv[0])
    sys.exit()

bibtex_file_name = sys.argv[1]
database = Open.bibopen(bibtex_file_name)
for entry in database.iterator():
    print "===", entry.key.key, "==="
    print "--- Authors (LaTeX) ---"
    print entry.get_latex('author')
    print "--- Authors (parsed and recombined) ---"
    for author in entry.get('author').authors:
        print author
    print
-----------------------------------------------------------------

As it probably should ;-) , entry.get_latex('author') returns the
original LaTeX code for the complete author field, while
entry.get('author').authors[0] returns the first author as a
converted string.

I would like to split the author field but still get the original
LaTeX code for the _individual_ authors. For example,

  author = {M. J. Fabian\'{n}ska and Gra\.{z}yna Bzowska and Aniela Matuszewska}

should be turned into a list (or similar data structure) of the
three entries

  M. J. Fabian\'{n}ska
  Gra\.{z}yna Bzowska
  Aniela Matuszewska

Is there a way to do this?

Background of the question:

- Especially in the above case, the conversion doesn't work; I get
  "Fabiannska, M. J.", "Bzowska, Gra.zyna" and "Matuszewska, Aniela"
  as authors.

- The extracted entries are going to be included in another LaTeX
  file, so it's probably better to let LaTeX handle things like
  \'{n} or \.{z} .

If it's not possible to get the individual authors as unescaped
strings: What's the best way to fix the conversions? Will the
conversion work if the result strings are Unicode? If yes, how can
I use Unicode (unicode) here instead of byte strings (str)?

Stefan
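
For illustration, the split asked about above could be sketched in plain Python, independent of pybliographer. This is only a sketch under some assumptions: the function name `split_latex_authors` is made up here and is not part of Pyblio; the input is assumed to be the raw value of the author field without the outer braces (whether entry.get_latex('author') returns it in exactly that form is not confirmed); and names are assumed to be separated by the word "and" with whitespace on both sides, outside any braces. Escaped braces such as \{ are not handled.

```python
def split_latex_authors(field):
    """Split a raw BibTeX author value on top-level ' and ' separators.

    Hypothetical helper, not part of pybliographer. The LaTeX markup of
    each name is returned untouched.
    """
    authors = []
    current = []
    depth = 0  # current brace nesting level
    i = 0
    while i < len(field):
        ch = field[i]
        if ch == '{':
            depth += 1
        elif ch == '}':
            depth -= 1
        # A separator is whitespace + "and" + whitespace at brace depth 0.
        if (depth == 0 and ch.isspace()
                and field[i + 1:i + 4].lower() == 'and'
                and field[i + 4:i + 5].isspace()):
            authors.append(''.join(current).strip())
            current = []
            i += 5  # skip the separator itself
            continue
        current.append(ch)
        i += 1
    authors.append(''.join(current).strip())
    # Drop empty pieces (e.g. from an empty field)
    return [a for a in authors if a]


raw = r"M. J. Fabian\'{n}ska and Gra\.{z}yna Bzowska and Aniela Matuszewska"
for author in split_latex_authors(raw):
    print(author)
```

Because the brace depth is tracked, an "and" inside braces (for example a corporate author written as {Barnes and Noble}) is left alone, and the accents \'{n} and \.{z} are passed through unchanged for LaTeX to handle.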