Hi Olga,
This is more of a Tripal question, so I'm cc'ing the Tripal list.
Scott
On Thu, Jan 25, 2018 at 7:33 AM, Olga Klonova <Olg...@st...> wrote:
> Hi,
>
> I have a question about creating a template for the bulk loader. I am
> aware of the tutorial on tripal.org, which is very useful, as well
> as some posts on the list, but my file seems to be too large to fit into
> a Chado database using the steps from the tutorial.
>
> In the file each protein has multiple references to external databases
> (UniProt, PDB, PFAM) and an EC number, information about relevant genes (GI
> number, GN, GO), the organism (Taxonomy ID, domain, class, family, genus,
> species), and some fields for the protein itself (ID, description, sequence
> length). All in all about 20 columns.
>
> I suspect that creating a template the usual way (as described in the tutorial)
> would lead to a mess, as there are too many fields to use and cross-link.
> So I wonder if there is a (recommended) way to load such data into Chado
> more efficiently. Would a long format help (instead of the wide format, which
> is being used now)? Or is it better to split the data into several subsets
> and load each independently? Or reorganise the columns somehow?
>
> Any help would be very much appreciated.
>
> Olga
>
> _______________________________________________
> Gmod-schema mailing list
> Gmo...@li...
> https://lists.sourceforge.net/lists/listinfo/gmod-schema
>
--
------------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research