Step 1. Converting .data Files into .txt Files
Once you have received the data files, check that each one has a unique file name and that they all exist within a single directory. If they are instead divided across uniquely named folders, the conversion software will require modification.
If the data files are organised correctly, or no major software modification is required, open the conversion software in Visual Studio and modify the directory path and, possibly, the variable names. The software can be downloaded from the Carbones SVN repository.
Paths:
For the software: Carbones SVN repository
For pre-existing data: Q:\CRs\Utilities\CARBONES_F862\Website\CR0013 Importing New Graph Data\Import
TR: Q:\CRs\Utilities\CARBONES_F862\Website\CR0013 Importing New Graph Data
Update: Due to problems with the .exe, this program has been updated so that it no longer looks for Fnee_gras, _crop or _for, but rather Fneegras, Fneecrop, etc. You will therefore have to run a .bat file to rename the .data files in your spatially aggregated data folder. A copy of this can be found at Q:\CRs\Utilities\CARBONES_F862\Website\CR0013 Importing New Graph Data\renameforgrascrop.bat, or on the Carbones data backup hard drive. Afterwards, rename the output text files back to their original names with underscores in, e.g. Fnee_crop.txt.
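The renaming step can be sketched in shell as follows. This is not the actual renameforgrascrop.bat; it is a minimal equivalent that strips underscores from .data file names so the updated converter finds Fneegras, Fneecrop, etc. It is demonstrated on throwaway files in a temporary directory; point it at your spatially aggregated data folder in real use.

```shell
# Sketch only: strip underscores from .data file names (e.g. Fnee_gras.data -> Fneegras.data).
# Uses a throwaway directory with dummy files for demonstration.
DATADIR=$(mktemp -d)
touch "$DATADIR/Fnee_gras.data" "$DATADIR/Fnee_crop.data" "$DATADIR/Fnee_for.data"
for f in "$DATADIR"/*_*.data; do
    base=$(basename "$f")
    mv "$f" "$DATADIR/${base//_/}"   # remove every underscore from the name
done
ls "$DATADIR"
```

Remember that after conversion you rename the output .txt files back to the underscore form.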
You will want to verify your text files against the .data files for data consistency, and against the pre-existing text files for structural consistency.
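A quick first-pass check, assuming the conversion preserves the number of data rows and that each .txt carries one CSV header line (as implied by the CSV HEADER option in the COPY statements of Step 3), is to compare line counts. The file contents below are dummy stand-ins for demonstration:

```shell
# Sketch only: compare row counts between a .data file and its converted .txt.
# Assumes the .txt has exactly one header line; file contents here are dummies.
DIR=$(mktemp -d)
printf '1\n2\n3\n' > "$DIR/Fnee.data"
printf 'startdate,value,series_id,region_id\n1\n2\n3\n' > "$DIR/Fnee.txt"
data_rows=$(wc -l < "$DIR/Fnee.data")
txt_rows=$(( $(wc -l < "$DIR/Fnee.txt") - 1 ))   # subtract the CSV header
if [ "$data_rows" -eq "$txt_rows" ]; then echo "Fnee: OK"; else echo "Fnee: MISMATCH"; fi
```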
Once you have your text files, make sure you also have data for the other tables. The following files must be updated and supplied to ensure maximum compatibility with the new graph data: variable.txt, region.txt, series.txt, temporal_filtering.txt, averaging_period.txt and units.txt (units is required to avoid errors, though its data may be redundant).
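Based on the COPY statements in Step 3 (DELIMITER E'\t', CSV HEADER), each of these lookup files is expected to be tab-separated with a header row naming the table columns. A hypothetical units.txt might look like this (the IDs and names shown are illustrative only):

```
units_id	units_name
1	GtC/yr
2	ppm
```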
Step 2. Construct New Database on Server
To create the database, open PuTTY and log on to the server as user root. Next, type in the following commands, substituting an appropriate number for <number>:
su postgres;
psql -c "CREATE DATABASE carbones_graph<number> OWNER carbones";
Now you need to change the user to carbones (you may have to reopen PuTTY):
su carbones;
psql carbones_graph<number>;
Log on to cPanel, or run the following SQL from PuTTY, to construct the database correctly. (NOTE: do not try running this all at once, as it may cause problems with PuTTY disconnects.)
--
-- PostgreSQL database dump
--
SET statement_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = off;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET escape_string_warning = off;
--
-- Name: plpgsql; Type: PROCEDURAL LANGUAGE; Schema: -; Owner: postgres
--
CREATE OR REPLACE PROCEDURAL LANGUAGE plpgsql;
-- ALTER PROCEDURAL LANGUAGE plpgsql OWNER TO postgres;
SET search_path = public, pg_catalog;
--
-- Name: dup_result; Type: TYPE; Schema: public; Owner: carbones
--
CREATE TYPE dup_result AS (
startdatefromdb timestamp without time zone,
value double precision
);
SET default_tablespace = '';
SET default_with_oids = false;
--
-- Name: averaging_period; Type: TABLE; Schema: public; Owner: carbones; Tablespace:
--
CREATE TABLE averaging_period (
averaging_period_id integer NOT NULL,
averaging_period_name text NOT NULL
);
-- ALTER TABLE public.averaging_period OWNER TO carbones;
--
-- Name: TABLE averaging_period; Type: COMMENT; Schema: public; Owner: carbones
--
COMMENT ON TABLE averaging_period IS 'Contains the time unit used for averaging, e.g. month';
--
-- Name: averaging_period_id_seq; Type: SEQUENCE; Schema: public; Owner: carbones
--
CREATE SEQUENCE averaging_period_id_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;
-- ALTER TABLE public.averaging_period_id_seq OWNER TO carbones;
--
-- Name: averaging_period_id_seq; Type: SEQUENCE OWNED BY; Schema: public; Owner: carbones
--
ALTER SEQUENCE averaging_period_id_seq OWNED BY averaging_period.averaging_period_id;
--
-- Name: graphdata; Type: TABLE; Schema: public; Owner: carbones; Tablespace:
--
CREATE TABLE graphdata (
startdate timestamp without time zone NOT NULL,
value double precision NOT NULL,
series_id integer NOT NULL,
region_id integer NOT NULL
);
-- ALTER TABLE public.graphdata OWNER TO carbones;
--
-- Name: TABLE graphdata; Type: COMMENT; Schema: public; Owner: carbones
--
COMMENT ON TABLE graphdata IS 'Contains the discrete values for each combination of region, averaging_period, temporal_filtering, variable and datetime';
--
-- Name: region; Type: TABLE; Schema: public; Owner: carbones; Tablespace:
--
CREATE TABLE region (
region_id integer NOT NULL,
region_code text,
region_name text,
island boolean,
isocean boolean
);
-- ALTER TABLE public.region OWNER TO carbones;
--
-- Name: TABLE region; Type: COMMENT; Schema: public; Owner: carbones
--
COMMENT ON TABLE region IS 'Contains information about the geographical regions this data applies to';
--
-- Name: series; Type: TABLE; Schema: public; Owner: carbones; Tablespace:
--
CREATE TABLE series (
series_id integer NOT NULL,
variable_id integer NOT NULL,
averaging_period_id integer NOT NULL,
temporal_filtering_id integer NOT NULL
);
-- ALTER TABLE public.series OWNER TO carbones;
--
-- Name: temporal_filtering; Type: TABLE; Schema: public; Owner: carbones; Tablespace:
--
CREATE TABLE temporal_filtering (
temporal_filtering_id integer NOT NULL,
temporal_filtering_name text NOT NULL
);
-- ALTER TABLE public.temporal_filtering OWNER TO carbones;
--
-- Name: TABLE temporal_filtering; Type: COMMENT; Schema: public; Owner: carbones
--
COMMENT ON TABLE temporal_filtering IS 'Contains the averaging type used to provide this data, e.g. mean';
--
-- Name: units; Type: TABLE; Schema: public; Owner: carbones; Tablespace:
--
CREATE TABLE units (
units_id integer NOT NULL,
units_name text NOT NULL
);
-- ALTER TABLE public.units OWNER TO carbones;
--
-- Name: TABLE units; Type: COMMENT; Schema: public; Owner: carbones
--
COMMENT ON TABLE units IS 'Contains the measurement units the data is stored in';
--
-- Name: variable; Type: TABLE; Schema: public; Owner: carbones; Tablespace:
--
CREATE TABLE variable (
variable_id integer NOT NULL,
variable_code text NOT NULL,
variable_name text NOT NULL,
units_id integer NOT NULL,
isflux boolean DEFAULT true NOT NULL
);
-- ALTER TABLE public.variable OWNER TO carbones;
--
-- Name: TABLE variable; Type: COMMENT; Schema: public; Owner: carbones
--
COMMENT ON TABLE variable IS 'Contains information about the flux/stock for which this data is related to, e.g. Photosynthetic Flux';
--
-- Name: fullset; Type: VIEW; Schema: public; Owner: carbones
--
CREATE VIEW fullset AS
SELECT v.variable_name AS variable, p.averaging_period_name AS averaging_period, f.temporal_filtering_name AS temporal_filtering, u.units_name AS units, r.region_name AS name, g.startdate, g.value FROM series s, averaging_period p, temporal_filtering f, variable v, units u, graphdata g, region r WHERE ((((((s.variable_id = v.variable_id) AND (s.averaging_period_id = p.averaging_period_id)) AND (s.temporal_filtering_id = f.temporal_filtering_id)) AND (v.units_id = u.units_id)) AND (g.series_id = s.series_id)) AND (g.region_id = r.region_id)) ORDER BY v.variable_name, p.averaging_period_name, f.temporal_filtering_name;
-- ALTER TABLE public.fullset OWNER TO carbones;
--
-- Name: noregions; Type: VIEW; Schema: public; Owner: carbones
--
CREATE VIEW noregions AS
SELECT v.variable_code AS variable, p.averaging_period_name AS averaging_period, f.temporal_filtering_name AS temporal_filtering, u.units_name AS units FROM series s, averaging_period p, temporal_filtering f, variable v, units u WHERE ((((s.variable_id = v.variable_id) AND (s.averaging_period_id = p.averaging_period_id)) AND (s.temporal_filtering_id = f.temporal_filtering_id)) AND (v.units_id = u.units_id)) ORDER BY v.variable_code, p.averaging_period_name, f.temporal_filtering_name;
-- ALTER TABLE public.noregions OWNER TO carbones;
--
-- Name: region_id_seq; Type: SEQUENCE; Schema: public; Owner: carbones
--
CREATE SEQUENCE region_id_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;
-- ALTER TABLE public.region_id_seq OWNER TO carbones;
--
-- Name: region_id_seq; Type: SEQUENCE OWNED BY; Schema: public; Owner: carbones
--
ALTER SEQUENCE region_id_seq OWNED BY region.region_id;
--
-- Name: series_id_seq; Type: SEQUENCE; Schema: public; Owner: carbones
--
CREATE SEQUENCE series_id_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;
-- ALTER TABLE public.series_id_seq OWNER TO carbones;
--
-- Name: series_id_seq; Type: SEQUENCE OWNED BY; Schema: public; Owner: carbones
--
ALTER SEQUENCE series_id_seq OWNED BY series.series_id;
--
-- Name: serieslist; Type: VIEW; Schema: public; Owner: carbones
--
CREATE VIEW serieslist AS
SELECT s.series_id, v.variable_code, ap.averaging_period_name, t.temporal_filtering_name FROM (((series s NATURAL JOIN variable v) NATURAL JOIN averaging_period ap) NATURAL JOIN temporal_filtering t) ORDER BY s.series_id;
-- ALTER TABLE public.serieslist OWNER TO carbones;
--
-- Name: temporal_filtering_id_seq; Type: SEQUENCE; Schema: public; Owner: carbones
--
CREATE SEQUENCE temporal_filtering_id_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;
-- ALTER TABLE public.temporal_filtering_id_seq OWNER TO carbones;
--
-- Name: temporal_filtering_id_seq; Type: SEQUENCE OWNED BY; Schema: public; Owner: carbones
--
ALTER SEQUENCE temporal_filtering_id_seq OWNED BY temporal_filtering.temporal_filtering_id;
--
-- Name: units_id_seq; Type: SEQUENCE; Schema: public; Owner: carbones
--
CREATE SEQUENCE units_id_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;
-- ALTER TABLE public.units_id_seq OWNER TO carbones;
--
-- Name: units_id_seq; Type: SEQUENCE OWNED BY; Schema: public; Owner: carbones
--
ALTER SEQUENCE units_id_seq OWNED BY units.units_id;
--
-- Name: variable_id_seq; Type: SEQUENCE; Schema: public; Owner: carbones
--
CREATE SEQUENCE variable_id_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;
-- ALTER TABLE public.variable_id_seq OWNER TO carbones;
--
-- Name: variable_id_seq; Type: SEQUENCE OWNED BY; Schema: public; Owner: carbones
--
ALTER SEQUENCE variable_id_seq OWNED BY variable.variable_id;
--
-- Name: averaging_period_id; Type: DEFAULT; Schema: public; Owner: carbones
--
ALTER TABLE averaging_period ALTER COLUMN averaging_period_id SET DEFAULT nextval('averaging_period_id_seq'::regclass);
--
-- Name: region_id; Type: DEFAULT; Schema: public; Owner: carbones
--
ALTER TABLE region ALTER COLUMN region_id SET DEFAULT nextval('region_id_seq'::regclass);
--
-- Name: series_id; Type: DEFAULT; Schema: public; Owner: carbones
--
ALTER TABLE series ALTER COLUMN series_id SET DEFAULT nextval('series_id_seq'::regclass);
--
-- Name: temporal_filtering_id; Type: DEFAULT; Schema: public; Owner: carbones
--
ALTER TABLE temporal_filtering ALTER COLUMN temporal_filtering_id SET DEFAULT nextval('temporal_filtering_id_seq'::regclass);
--
-- Name: units_id; Type: DEFAULT; Schema: public; Owner: carbones
--
ALTER TABLE units ALTER COLUMN units_id SET DEFAULT nextval('units_id_seq'::regclass);
--
-- Name: variable_id; Type: DEFAULT; Schema: public; Owner: carbones
--
ALTER TABLE variable ALTER COLUMN variable_id SET DEFAULT nextval('variable_id_seq'::regclass);
--
-- Name: averaging_period_pkey; Type: CONSTRAINT; Schema: public; Owner: carbones; Tablespace:
--
ALTER TABLE ONLY averaging_period
ADD CONSTRAINT averaging_period_pkey PRIMARY KEY (averaging_period_id);
--
-- Name: composite; Type: CONSTRAINT; Schema: public; Owner: carbones; Tablespace:
--
ALTER TABLE ONLY graphdata
ADD CONSTRAINT composite UNIQUE (startdate, series_id, region_id);
--
-- Name: region_pkey; Type: CONSTRAINT; Schema: public; Owner: carbones; Tablespace:
--
ALTER TABLE ONLY region
ADD CONSTRAINT region_pkey PRIMARY KEY (region_id);
--
-- Name: series_pkey; Type: CONSTRAINT; Schema: public; Owner: carbones; Tablespace:
--
ALTER TABLE ONLY series
ADD CONSTRAINT series_pkey PRIMARY KEY (series_id);
--
-- Name: temporal_filtering_pkey; Type: CONSTRAINT; Schema: public; Owner: carbones; Tablespace:
--
ALTER TABLE ONLY temporal_filtering
ADD CONSTRAINT temporal_filtering_pkey PRIMARY KEY (temporal_filtering_id);
--
-- Name: units_pkey; Type: CONSTRAINT; Schema: public; Owner: carbones; Tablespace:
--
ALTER TABLE ONLY units
ADD CONSTRAINT units_pkey PRIMARY KEY (units_id);
--
-- Name: variable_pkey; Type: CONSTRAINT; Schema: public; Owner: carbones; Tablespace:
--
ALTER TABLE ONLY variable
ADD CONSTRAINT variable_pkey PRIMARY KEY (variable_id);
--
-- Name: av_period; Type: INDEX; Schema: public; Owner: carbones; Tablespace:
--
CREATE INDEX av_period ON series USING btree (averaging_period_id);
--
-- Name: serie; Type: INDEX; Schema: public; Owner: carbones; Tablespace:
--
CREATE UNIQUE INDEX serie ON series USING btree (variable_id, averaging_period_id, temporal_filtering_id);
--
-- Name: temp_filt; Type: INDEX; Schema: public; Owner: carbones; Tablespace:
--
CREATE INDEX temp_filt ON series USING btree (temporal_filtering_id);
--
-- Name: variable_id; Type: INDEX; Schema: public; Owner: carbones; Tablespace:
--
CREATE INDEX variable_id ON series USING btree (variable_id);
--
-- Name: averaging_period; Type: FK CONSTRAINT; Schema: public; Owner: carbones
--
ALTER TABLE ONLY series
ADD CONSTRAINT averaging_period FOREIGN KEY (averaging_period_id) REFERENCES averaging_period(averaging_period_id);
--
-- Name: temporal_filtering; Type: FK CONSTRAINT; Schema: public; Owner: carbones
--
ALTER TABLE ONLY series
ADD CONSTRAINT temporal_filtering FOREIGN KEY (temporal_filtering_id) REFERENCES temporal_filtering(temporal_filtering_id);
--
-- Name: units; Type: FK CONSTRAINT; Schema: public; Owner: carbones
--
ALTER TABLE ONLY variable
ADD CONSTRAINT units FOREIGN KEY (units_id) REFERENCES units(units_id);
--
-- Name: variable; Type: FK CONSTRAINT; Schema: public; Owner: carbones
--
ALTER TABLE ONLY series
ADD CONSTRAINT variable FOREIGN KEY (variable_id) REFERENCES variable(variable_id);
--
-- Name: public; Type: ACL; Schema: -; Owner: postgres
--
REVOKE ALL ON SCHEMA public FROM PUBLIC;
REVOKE ALL ON SCHEMA public FROM postgres;
GRANT ALL ON SCHEMA public TO postgres;
GRANT ALL ON SCHEMA public TO PUBLIC;
--
-- PostgreSQL database dump complete
--
Step 3. Putting New Data into New Database on Server
The first step is to make a connection to the server with FileZilla. Once this has been established, find your text files once again. On the local machine, divide the data into folders of around 1 GB each and then compress each into a .zip archive using WinRAR. Next, create a suitable directory via FileZilla and start uploading the new .zip archives there. At this point it is a good idea to check how much free space the server has, and to clear as much as possible before proceeding any further.
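The "divide into folders of around 1 GB" step is done by hand with WinRAR in the original instructions, but it can be sketched in shell as below. The directory names and the tiny 1 KB budget are purely for demonstration; substitute a real budget (around 1000000000 bytes) and your own paths:

```shell
# Sketch only: group files into subfolders whose total size stays under a budget.
# Demonstrated with small dummy files and a 1 KB budget for brevity.
SRC=$(mktemp -d); DST=$(mktemp -d)
for i in 1 2 3 4; do head -c 400 /dev/zero > "$SRC/file$i.txt"; done
BUDGET=1024                      # use e.g. 1000000000 for ~1 GB
batch=1; used=0
mkdir -p "$DST/Set$batch"
for f in "$SRC"/*.txt; do
    size=$(wc -c < "$f")
    if [ $((used + size)) -gt "$BUDGET" ] && [ "$used" -gt 0 ]; then
        batch=$((batch + 1)); used=0; mkdir -p "$DST/Set$batch"   # start a new folder
    fi
    mv "$f" "$DST/Set$batch/"; used=$((used + size))
done
ls "$DST"
```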
Once all the data has been uploaded, it needs unzipping. Open PuTTY again and run the following commands, substituting FolderName and ZippedDataFolder each time with the correct subdirectory and archive names.
cd /home/carbones/FolderName/
unzip ZippedDataFolder.zip
Once this has finished, delete the .zip files to immediately clear some space. Back in PuTTY it is now time to start adding data to the database itself from the .txt files. To do this run the following command (note that this will take several hours!). Be aware that the inner double quotes around "value" must be escaped as \" when the SQL is embedded in a psql -c "..." string; alternatively, save the statements to a file and run that file with psql's -f option against the database.
psql -c "
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set1/Babove.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set1/Bbelow.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set1/Btot.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set1/Fbbur.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set1/Fbbur_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set1/Ffos.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set1/Ffos_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set1/Fgpp.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set1/Fgpp_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set1/Fland.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set1/Fland_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set1/Fle.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set1/Fle_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set2/Fnatural.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set2/Fnatural_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set2/Fnbp.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set2/Fnbp_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set2/Fnee.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set2/Fnee_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set2/Foce.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set2/Foce_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set2/Fresp.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set2/Fresp_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set2/Fsens.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set2/Fsens_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set2/Ftot.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set2/Ftot_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set2/LAI.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set2/SOC.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Fnee_for.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Fnee_for_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Fnee_gras.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Fnee_gras_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Fnee_crop.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Fnee_crop_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Babove_for.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Bbelow_for.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Fgpp_for.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Fgpp_for_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Fgpp_gras.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Fgpp_gras_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Fgpp_crop.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Fgpp_crop_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Fresp_for.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Fresp_for_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Fresp_gras.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Fresp_gras_raw.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Fresp_crop.txt'
WITH DELIMITER ','
CSV HEADER;
COPY graphdata(startdate, "value", series_id, region_id)
FROM '/home/carbones/CarbonesV1/Set3/Fresp_crop_raw.txt'
WITH DELIMITER ','
CSV HEADER;" dbname;
Run the following lookup-table COPY statements from a psql session connected to the database (or wrap them in psql -c in the same way):
COPY units("units_id","units_name")
FROM '/home/carbones/newgraphdata/Data/units.txt'
WITH DELIMITER E'\t'
CSV HEADER;
COPY variable("variable_id","variable_code","variable_name","units_id","isflux")
FROM '/home/carbones/newgraphdata/Data/variable.txt'
WITH DELIMITER E'\t'
CSV HEADER;
COPY region("region_id","region_code","region_name","island","isocean")
FROM '/home/carbones/newgraphdata/Data/region.txt'
WITH DELIMITER E'\t'
CSV HEADER;
COPY temporal_filtering("temporal_filtering_id","temporal_filtering_name")
FROM '/home/carbones/newgraphdata/Data/temporal_filtering.txt'
WITH DELIMITER E'\t'
CSV HEADER;
COPY averaging_period("averaging_period_id","averaging_period_name")
FROM '/home/carbones/newgraphdata/Data/averagingperiod.txt'
WITH DELIMITER E'\t'
CSV HEADER;
COPY series("series_id","variable_id","averaging_period_id","temporal_filtering_id")
FROM '/home/carbones/newgraphdata/Data/series.txt'
WITH DELIMITER E'\t'
CSV HEADER;
Remember to change dbname and file paths!
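Rather than hand-editing the long list of COPY statements above when file names change, the list can be generated from the files actually present. The sketch below writes one COPY statement per .txt file to a .sql file (the directory layout and output file name are stand-ins; it demonstrates on dummy files under a temporary directory). The generated file could then be run with psql's -f option. Note that value is left unquoted here, which PostgreSQL accepts since it is not a reserved word in a column list:

```shell
# Sketch only: generate COPY statements for every .txt file under Set* folders.
DATAROOT=$(mktemp -d)                      # stand-in for /home/carbones/CarbonesV1
mkdir -p "$DATAROOT/Set1"
touch "$DATAROOT/Set1/Babove.txt" "$DATAROOT/Set1/Bbelow.txt"
OUT="$DATAROOT/load_graphdata.sql"
: > "$OUT"                                 # truncate/create the output file
for f in "$DATAROOT"/Set*/*.txt; do
    printf "COPY graphdata(startdate, value, series_id, region_id)\nFROM '%s'\nWITH DELIMITER ','\nCSV HEADER;\n" "$f" >> "$OUT"
done
grep -c '^COPY' "$OUT"                     # one statement per file
```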
Step 4. Add Functions and Split New Database Tables
This will set up the database so that graph production is much more efficient. The majority of the work is simply to copy, paste and execute each segment of code ONE BY ONE, and to finish off by running VACUUM ANALYZE. Do not try running more than one segment at once with a ';' divider! It is a very long process (over 100 minutes) and you do not want to risk errors at the very end, because your changes may not be applied.
Once the core data is in place (so you have the approximately 60-million-row graphdata table), run the following function to create the split tables. Please ensure that the table is actually named graphdata; otherwise change every reference to it in the function below.
-- Function: maketables()
-- DROP FUNCTION maketables();
CREATE OR REPLACE FUNCTION maketables()
RETURNS void AS
$BODY$
DECLARE
rec RECORD;
rec2 RECORD;
period TEXT;
variable TEXT;
tablename TEXT;
BEGIN
-- For number of variables
FOR rec IN SELECT variable_code FROM variable
LOOP
-- For number of averaging periods
FOR rec2 IN SELECT averaging_period_name FROM averaging_period
LOOP
period = rec2.averaging_period_name;
variable = rec.variable_code;
tablename = 'graphdata_' || period;
tablename = tablename || '_' || variable;
--Start SQL
EXECUTE '
DROP TABLE IF EXISTS ' || tablename || ' ;
CREATE TABLE ' || tablename || ' AS
SELECT
g.*
FROM
series s NATURAL JOIN
averaging_period a NATURAL JOIN
variable v NATURAL JOIN
graphdata g
WHERE
a.averaging_period_name = ' || quote_literal(period) || ' AND
v.variable_code =' || quote_literal(variable) ||
' ORDER BY
g.startdate';
END LOOP;
END LOOP;
RETURN;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
Once this function has been created, you can run it by entering:
SELECT maketables();
After this (long) process, please make sure that you run VACUUM ANALYZE, ensuring that maketables() has been run first and separately, or it will throw an error and maketables() may need to be run a second time (as mentioned, a long process - you do not want this to happen).
Next you will need to run this to ensure that the correct functions exist.
-- Function: linegraph_multiregionperiodvariable(text, text, text, text[], timestamp without time zone, timestamp without time zone)
-- DROP FUNCTION linegraph_multiregionperiodvariable(text, text, text, text[], timestamp without time zone, timestamp without time zone);
CREATE OR REPLACE FUNCTION linegraph_multiregionperiodvariable(variable text, period text, avtype text, region text[], startdatein timestamp without time zone, enddatein timestamp without time zone)
RETURNS SETOF record AS
$BODY$
DECLARE
ret_row record ;
SQLText text;
bFirstTime boolean;
BEGIN
bFirstTime = True;
FOR i in array_lower(region, 1) .. array_upper(region, 1) LOOP
IF bFirstTime Then
SQLText = 'SUM(CASE WHEN r.region_code = ' || quote_literal(region[i]) || '::text THEN g.value END) AS A ';
bFirstTime = False;
ELSE
SQLText = SQLText || ', SUM(CASE WHEN r.region_code = ' || quote_literal(region[i]) || '::text THEN g.value END) AS B ';
END IF;
END LOOP;
period = 'graphdata_' || period;
period = period || '_' || variable;
FOR ret_row in EXECUTE 'SELECT
g.startdate, ' || SQLText ||
'FROM
series s NATURAL JOIN
temporal_filtering f NATURAL JOIN
' || period || ' g NATURAL JOIN
region r
WHERE
f.temporal_filtering_name = ' || quote_literal(avtype) || '::text AND
g.startdate >= ' || quote_literal(startdatein) || '::timestamp without time zone AND
g.startdate <= ' || quote_literal(enddatein) || '::timestamp without time zone
GROUP BY g.startdate
ORDER BY g.startdate'
LOOP
return NEXT ret_row;
END LOOP;
RETURN;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100
ROWS 10000;
CREATE OR REPLACE FUNCTION linegraphperiodvariable(variable text, period text, avtype text, region text, startdatein timestamp without time zone, enddatein timestamp without time zone)
RETURNS SETOF dup_result AS
$BODY$
DECLARE
ret_row dup_result ;
BEGIN
period = 'graphdata_' || period;
period = period || '_' || variable;
FOR ret_row in EXECUTE
'SELECT g.startdate, g.value
FROM
series s NATURAL JOIN
temporal_filtering f NATURAL JOIN
region r NATURAL JOIN '
|| period || ' g
WHERE
f.temporal_filtering_name = ' || quote_literal(avtype) || '::text AND
r.region_code = ' || quote_literal(region) || '::text AND
g.startdate >= ' || quote_literal(startdatein) || '::timestamp without time zone AND
g.startdate <= ' || quote_literal(enddatein) || '::timestamp without time zone
ORDER BY
g.startdate'
LOOP
return NEXT ret_row;
END LOOP;
RETURN;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100
ROWS 10000;
These are linegraphperiodvariable and linegraph_multiregionperiodvariable, both of which rely on the new split-table database structure to function correctly.
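As a rough illustration, a single-region query might be issued as follows. The variable, period, filtering and region codes here are examples only (the region code in particular is hypothetical) and must match rows in your variable, averaging_period, temporal_filtering and region tables:

```sql
-- Hypothetical call; returns (startdatefromdb, value) rows via the dup_result type
SELECT * FROM linegraphperiodvariable(
    'Fnee',     -- variable_code
    'month',    -- averaging_period_name
    'mean',     -- temporal_filtering_name
    'EU27',     -- region_code (example only)
    '2000-01-01'::timestamp without time zone,
    '2005-12-31'::timestamp without time zone);
```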
Step 5. Modify Permissions
It may be necessary to grant SELECT permissions on the new tables. This is needed when the webapp does not access the database with the PostgreSQL administrator login, which is best practice anyway.
PostgreSQL permissions can be granted through the psql command prompt as follows:
psql -U postgres
postgres=# \c <<database-name>>
database-name=# GRANT SELECT ON ALL TABLES IN SCHEMA public TO <<user-name>>;
GRANT
Step 6. Modify Webapp
To use the new data you may need to modify the webapp, double-checking that it is pointing to the correct database and using the correct functions to perform data searches.
These settings are in the file WEB-INF/lib/external_configuration/carbones_local.properties. The database connection details are among these settings:
carbones.jdbc.url = jdbc:postgresql://localhost:5432/carbones_graph2
carbones.jdbc.driver = org.postgresql.Driver
carbones.jdbc.username = USERNAMEHERE
carbones.jdbc.password = ****
You may also need to change these settings.
TIMESERIES_SQL_STRING=select * from linegraphperiodvariable ( ? ,?,?,?,?,?)
TIMESERIES_MULTIREGION_SQL_STRING=select <<regions>> from linegraph_multiregionperiodvariable ( ? ,?,?,<<regionValues>>,?,?) As (<<regionTypes>>)
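Because linegraph_multiregionperiodvariable returns SETOF record, the caller has to supply a column definition list, which is what the <<regionTypes>> placeholder expands to (the A and B column names come from the aliases inside the function). A hypothetical expansion for two regions, with example region codes, might look like:

```sql
SELECT startdate, A, B
FROM linegraph_multiregionperiodvariable(
    'Fnee', 'month', 'mean',
    ARRAY['EU27','GLB'],   -- <<regionValues>> (example codes only)
    '2000-01-01'::timestamp without time zone,
    '2005-12-31'::timestamp without time zone)
AS (startdate timestamp without time zone, A double precision, B double precision);
```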
Once this is redeployed, double check everything is working.
Wiki: DevQuickStart_GraphDatabase
Wiki: SQLEnterDataIntoDatabase