Before giving you access to the PanLex server and data, I'd like to know what kind(s) of access you wish to have. Among the options I can think of are:
(1) A dump of the database (PostgreSQL 9.1).
(2) Tab-delimited copies of specific tables.
(3) The right to connect remotely to the database on the server.
(4) The same Web and API access that the general public has.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Anonymous
-
2011-05-04
I would prefer a compressed dump of the sql db. At some point it would be nice to also have a web service/ API for quick online access for applications.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
We doubled the transmission speed to and from the PanLex server this weekend. Of course, this resulted in (or coincided with) failures that have so far eluded diagnosis. Troubleshooting continues. Until the problem is solved, access to the PanLex and Utilika websites may be erratic, and preparation of a dump of the database will be delayed. Thanks for your understanding.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I got the plx.sql dump last week, and I'm having some trouble restoring it into a PostgreSQL database. I'm not terribly familiar with PostgreSQL, and I've been following the instructions at https://help.ubuntu.com/community/PostgreSQL for installing it to my Ubuntu machine.
When I tried to do the following to restore the dump to a database called plx, I got a bunch of errors:
sudo -u postgres psql plx < plx.sql
The errors were mostly:
CREATE TABLE
ERROR: role "apache" does not exist
CREATE VIEW
ERROR: role "pool" does not exist
So I used pgadmin3 to add a login role "apache", and tried again. This time I many more errors:
CREATE SCHEMA
ERROR: role "pool" does not exist
ERROR: syntax error at or near "PROCEDURAL"
LINE 1: CREATE OR REPLACE PROCEDURAL LANGUAGE plperl;
ERROR: language "plpgsql" does not exist
ERROR: function apad(text, text, text, text, text, text, integer, integer, integer, text, text, text, text, text, text, integer) does not exist
Do you have any suggestions for restoring the dump to a database?
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
The PostgreSQL 9.0.1 documentation in section 24.1.1 says that the prerequisites to restoration of a plx dump include:
1. createdb -T template0 plx
2. all the users who own objects or were granted permissions on objects in the dumped database must already exist.
So I think it wasn't enough to give you a dump file. I also needed to give you a list of roles with any ownership or permissions in plx.
In principle, the only such roles should be "apache" and "reader", but I see I have probably included a couple of tables with "pool" as the owner, so I think you'd need to create those 3 roles before restoring. The fact that "pool" is demanded for creation of a schema surprises me, because I thought I included only the "public" schema in the dump, and its owner is "postgres".
The error with "CREATE OR REPLACE PROCEDURAL LANGUAGE" is also a surprise. That syntax is OK per the documentation. The languages installed in plx are:
createlang -l plx
Procedural Languages
Name | Trusted?
------+-------
plperl | yes
plperlu | no
plpgsql | yes
If the system responds that "plpgsql" does not exist, I presume that means it isn't installed. The documentation says "In PostgreSQL 9.0 and later, PL/pgSQL is installed by default."
I don't understand why it would be reported that the specified "apad" function does not exist. In plx there are 2 "apad" functions, and one has the specified signature:
\df apad
List of functions
Schema | Name | Result data type | Argument data types | Type
-----+----+------------+------------------------------------------------------------------------+-----
public | apad | integer | text, text, text, text, text, text, integer, integer, integer, text, text, text, text, text, integer | normal
public | apad | integer | text, text, text, text, text, text, integer, integer, integer, text, text, text, text, text, text, integer | normal
(2 rows)
From the above, can you eliminate any of these errors? If so, what errors remain?
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Thanks for the response. I added the users, and upgraded to PostgreSQL 9.1, since the default version in Ubuntu's repositories was 8.4. These two things seemed to help a bit, but I still get some errors, which seem to all be related to the language issue:
ERROR: could not access file "$libdir/plperl": No such file or directory
ERROR: language "plperl" does not exist
ERROR: could not access file "$libdir/plperl": No such file or directory
ERROR: language "plperlu" does not exist
Then it turned out that plperl was not installed by default. Installing it seems to have fixed the issue (so far I have no error messages, but it's spent the last 40 minutes presumably loading the database… well at least my processor is near 100% and the I'm getting reassuring messages in the terminal (lots of "SET" and "ALTER TABLE")).
So for other users of Ubuntu (nb: I'm using 11.04), here's the steps. First you need to add the repository with the 9.0 versions of PostgreSQL. pitti is Martin Pitt, the maintainer of PostgreSQL for Ubuntu.
Use psql to set postgres user password, create database, create roles (nb: I didn't do all of these at once, and I used the pgadmin3 gui for adding roles, so YMMV):
sudo -u postgres psql postgres
> \password postgres
# enter password when prompted
> create database plx;
> create role apache;
> create role reader;
> create role pool;
Finally, restore the database:
sudo add-apt-repository ppa:pitti/postgresql
Hopefully that helps!
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Thanks for persisting and documenting what you found to work.
Once the database is operational, the best method I've found for inspecting the stored procedures is in psql to use \H to change to HTML output, \pset pager to turn page pauses off, and \df+ to output an HTML dump of the procedures, and then to copy that and paste it into a file for display in a web browser.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
The restore process finally completed, and near the end there were a few more errors:
ALTER TABLE
REVOKE
REVOKE
GRANT
GRANT
ERROR: role "smc" does not exist
REVOKE
REVOKE
GRANT
GRANT
GRANT
ERROR: role "smc" does not exist
GRANT
ERROR: role "evans" does not exist
Can these be safely ignored? Or do I need to add those roles and re-restore again?
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I think these other roles exist only in schemas other than "public", and I thought I had provided only the "public" schema. Those roles should have no bearing on the normal operation of the database, and I'm guessing you can ignore these error messages.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Before giving you access to the PanLex server and data, I'd like to know what kind(s) of access you wish to have. Among the options I can think of are:
(1) A dump of the database (PostgreSQL 9.1).
(2) Tab-delimited copies of specific tables.
(3) The right to connect remotely to the database on the server.
(4) The same Web and API access that the general public has.
How big is the dump of the database?
The latest dump (before compression) is 824 MB.
I would prefer a compressed dump of the sql db. At some point it would be nice to also have a web service/ API for quick online access for applications.
I'll prepare a suitable dump for download by interns wanting it. It should be ready during the weekend.
As for APIs, is there a style (protocol) that seems best? See section 6 of “Developing an API for PanLex”.
@reganto, glad you asked that last question, since I'll be doing some work on the API as part of the work on PanImages.
I would prefer API and remote connection access, though a dump may also be useful.
We doubled the transmission speed to and from the PanLex server this weekend. Of course, this resulted in (or coincided with) failures that have so far eluded diagnosis. Troubleshooting continues. Until the problem is solved, access to the PanLex and Utilika websites may be erratic, and preparation of a dump of the database will be delayed. Thanks for your understanding.
Sorry for the delay. There is now a compressed dump 713 MB in size that takes about an hour to download. Let me know if you would like it.
(post moved from other thread)
I got the plx.sql dump last week, and I'm having some trouble restoring it into a PostgreSQL database. I'm not terribly familiar with PostgreSQL, and I've been following the instructions at https://help.ubuntu.com/community/PostgreSQL for installing it to my Ubuntu machine.
When I tried to do the following to restore the dump to a database called plx, I got a bunch of errors:
sudo -u postgres psql plx < plx.sql
The errors were mostly:
CREATE TABLE
ERROR: role "apache" does not exist
CREATE VIEW
ERROR: role "pool" does not exist
So I used pgadmin3 to add a login role "apache", and tried again. This time I many more errors:
CREATE SCHEMA
ERROR: role "pool" does not exist
ERROR: syntax error at or near "PROCEDURAL"
LINE 1: CREATE OR REPLACE PROCEDURAL LANGUAGE plperl;
ERROR: language "plpgsql" does not exist
ERROR: function apad(text, text, text, text, text, text, integer, integer, integer, text, text, text, text, text, text, integer) does not exist
Do you have any suggestions for restoring the dump to a database?
Sorry for the delayed reply.
The PostgreSQL 9.0.1 documentation in section 24.1.1 says that the prerequisites to restoration of a plx dump include:
1. createdb -T template0 plx
2. all the users who own objects or were granted permissions on objects in the dumped database must already exist.
So I think it wasn't enough to give you a dump file. I also needed to give you a list of roles with any ownership or permissions in plx.
In principle, the only such roles should be "apache" and "reader", but I see I have probably included a couple of tables with "pool" as the owner, so I think you'd need to create those 3 roles before restoring. The fact that "pool" is demanded for creation of a schema surprises me, because I thought I included only the "public" schema in the dump, and its owner is "postgres".
The error with "CREATE OR REPLACE PROCEDURAL LANGUAGE" is also a surprise. That syntax is OK per the documentation. The languages installed in plx are:
createlang -l plx
Procedural Languages
Name | Trusted?
------+-------
plperl | yes
plperlu | no
plpgsql | yes
If the system responds that "plpgsql" does not exist, I presume that means it isn't installed. The documentation says "In PostgreSQL 9.0 and later, PL/pgSQL is installed by default."
I don't understand why it would be reported that the specified "apad" function does not exist. In plx there are 2 "apad" functions, and one has the specified signature:
\df apad
List of functions
Schema | Name | Result data type | Argument data types | Type
-----+----+------------+------------------------------------------------------------------------+-----
public | apad | integer | text, text, text, text, text, text, integer, integer, integer, text, text, text, text, text, integer | normal
public | apad | integer | text, text, text, text, text, text, integer, integer, integer, text, text, text, text, text, text, integer | normal
(2 rows)
From the above, can you eliminate any of these errors? If so, what errors remain?
Thanks for the response. I added the users, and upgraded to PostgreSQL 9.1, since the default version in Ubuntu's repositories was 8.4. These two things seemed to help a bit, but I still get some errors, which seem to all be related to the language issue:
ERROR: could not access file "$libdir/plperl": No such file or directory
ERROR: language "plperl" does not exist
ERROR: could not access file "$libdir/plperl": No such file or directory
ERROR: language "plperlu" does not exist
Then it turned out that plperl was not installed by default. Installing it seems to have fixed the issue (so far I have no error messages, but it's spent the last 40 minutes presumably loading the database… well at least my processor is near 100% and the I'm getting reassuring messages in the terminal (lots of "SET" and "ALTER TABLE")).
So for other users of Ubuntu (nb: I'm using 11.04), here's the steps. First you need to add the repository with the 9.0 versions of PostgreSQL. pitti is Martin Pitt, the maintainer of PostgreSQL for Ubuntu.
And add the plperl package:
Use psql to set postgres user password, create database, create roles (nb: I didn't do all of these at once, and I used the pgadmin3 gui for adding roles, so YMMV):
Finally, restore the database:
Hopefully that helps!
Sorry that last command for restoring the database should be:
Thanks for persisting and documenting what you found to work.
Once the database is operational, the best method I've found for inspecting the stored procedures is in psql to use \H to change to HTML output, \pset pager to turn page pauses off, and \df+ to output an HTML dump of the procedures, and then to copy that and paste it into a file for display in a web browser.
Thanks for the tip.
The restore process finally completed, and near the end there were a few more errors:
ALTER TABLE
REVOKE
REVOKE
GRANT
GRANT
ERROR: role "smc" does not exist
REVOKE
REVOKE
GRANT
GRANT
GRANT
ERROR: role "smc" does not exist
GRANT
ERROR: role "evans" does not exist
Can these be safely ignored? Or do I need to add those roles and re-restore again?
I think these other roles exist only in schemas other than "public", and I thought I had provided only the "public" schema. Those roles should have no bearing on the normal operation of the database, and I'm guessing you can ignore these error messages.