The search button appears to default to searching for text in Persons, Families, Sources and Shared Notes, across all GEDCOMs. This can occasionally take a long time (many minutes) when the GEDCOMs together exceed, say, 20 MB.
There is no clear signal that the system is busy, so the user might give up in despair. In my situation I have several times waited 5-15 minutes for a response (or longer, at which point I cancel).
A better default, it seems to me, would be to search only for persons in the current GEDCOM, which would always give a fast reply. The other search function already lets users select the type and scope of the search.
My gedcom is 15MB, and the "quick search" button responds in a few seconds.
I'm aware that the "family search" query part of this runs especially slowly on certain versions of MySQL, e.g. an identical query works fine on my live server (MySQL 4.1), but hangs on my development server (MySQL 5.1).
It seems to be ignoring the indexes and generating huge temporary result sets. I haven't had the time to investigate, but I hope to do so before the next release.
Paul
Why would the system artificially limit a general search? And what kind of timeout do you have set in PHP?
The advanced search is constructed to allow someone to filter searches by type, gedcom and more, which seems appropriate to me. The general search in the header should be exactly that, a site search for the terms you seek.
If your search takes more than 1.5 minutes, you more likely have a collation or hardware issue. We host 4 GEDCOMs (42 MB, 10 MB, 6 MB and 3 MB) and general searches return results quite promptly.
Stephen
I understand now from okbigkid that the search in the upper right corner has been designed as a text search through all GEDCOMs. I was not aware of this. So let it be.
From the reply of fisharebest, I conclude that I have run into some MySQL version trouble, resulting in extremely slow responses.
So what should I do?
Negotiate with my service provider for an investigation?
Or is this something that first needs inspection by PhpGedView development?
I have a PGV installation that is slow on searches with large result sets. There are several thousand individuals with one surname. Sometimes it seems that the browser gives up before PGV does; other times there's a PHP error, likely a timeout. It won't work at all with a memory limit below 100M.
Why aren't large result sets returned by the pageful (e.g. 25 or 100 per page), as on Google and many other sites (even this site, SourceForge, does this with searches)? This would solve the problem even for low memory, short timeout, and non-optimal database situations.
If there's agreement that a patch doing this might be accepted I'd be willing to work on it. It seems something this high-profile could use more than one person on it…
George
Anonymous - 2010-02-10
George
I'm absolutely sure everyone would be happy for any help with development.
First you need to make sure you are using the latest version (the SVN trunk), and don't forget that it is being updated at least daily. It's no good developing a patch for a released version, as development has already moved well past that point. A number of us run either live sites or test ones from SVN, so that all improvements can be tested as soon as possible.
Your idea sounds interesting, though I confess I'm a little skeptical. But please don't let that put you off :-)
Remember, searches here and on Google don't have to consider the very complex privacy options that PGV is required to follow, all of which are customisable, so every site has different requirements. That's often a key factor in slow speeds.
Nevertheless, fisharebest is the resident expert in this field, and he did say further up here:
It's something that the PGV developers need to look at. It's on my list of things to do.
But paying work has to take priority over PGV development….
Nigel
… just a couple of other thoughts…
Not sure how far you have thought through your idea yet, but as I imagine a "paged" set of results, presumably we would lose the current ability to sort the results after display, by date of birth, place, death, age etc? I find that a very valuable tool.
I just did a general search for the most common surname on my site, and it gave results in about 11 secs for 300 individuals, 77 families (plus many more hidden due to privacy), 100 source records - all across three GEDCOMs. So, given a well maintained DB, how much of a problem is there really? I suspect when fisharebest says it needs looking at, he's referring to this comment:
My gedcom is 15MB, and the "quick search" button responds in a few seconds.
I'm aware that the "family search" query part of this runs especially slowly on certain versions of MySQL, e.g. an identical query works fine on my live server (MySQL 4.1), but hangs on my development server (MySQL 5.1).
It seems to be ignoring the indexes and generating huge temporary result sets. I haven't had the time to investigate, but I hope to do so before the next release.
I have a PGV installation that is slow on searches with large result sets. There are several thousand individuals with one surname.
I have a similar situation. What about returning responsibility to the user who started the request?
For example, PGV could return a signal such as "500 results so far, still 90% to search", with the possibility to cancel.
In 99% (?) of such cases people will be happy to get the opportunity to refine their request, and will become aware of the impact, and the likely pointlessness, of scrolling through many hundreds of results.
Problem is that the db has to be searched to discover that there are "too many" results.
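One common way to soften this is to ask for one row more than the display cap: if that extra row turns up, there are "too many" results and the rest need not be fetched or rendered. The following is only a rough sketch of that idea. The names fetch_capped and fake_matches are hypothetical and not part of PGV, and the generator merely stands in for a database cursor. Whether the database itself stops early depends on the query, and PGV's privacy filtering is not modelled here.

```php
<?php
// Hypothetical helper (not part of PGV): read at most $limit rows from a
// lazily evaluated result set and report whether more rows were available.
function fetch_capped($rows, $limit)
{
    $kept = array();
    $hasMore = false;
    foreach ($rows as $row) {
        if (count($kept) === $limit) {
            // One row beyond the cap is enough to know there are "too many".
            $hasMore = true;
            break;
        }
        $kept[] = $row;
    }
    return array($kept, $hasMore);
}

// Stand-in for a database cursor that yields matches one at a time.
function fake_matches($total)
{
    for ($i = 1; $i <= $total; $i++) {
        yield "I$i";
    }
}

list($kept, $hasMore) = fetch_capped(fake_matches(2000), 500);
echo count($kept), $hasMore ? " shown, more available\n" : " shown\n";
```

With a cap of 500 the loop stops at the 501st match, so the remaining rows are never pulled from the cursor; the user can then be told to refine the search.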
Is the slowness caused by the database query, or by the browser working through the client-side JavaScript sorting routines? Do you get similar or dissimilar experiences with different browsers … IE8, Firefox 3.6, Safari, Chrome?
Mark
pseesink - if you want to identify the bottleneck, you can easily enable SQL logging.
In your includes/session.php file, near the beginning, is a line that sets PGV_DEBUG_SQL to false. If you set this to true, each page will display a list of all the SQL statements executed, along with the time taken to execute each one (in milliseconds). You can mouse over the placeholders to see what parameters were passed to the query, and mouse over the statement number to see where in the code the SQL was called.
To enable this debug selectively (e.g. on a live server), you can make it conditional on your IP address. i.e. look up your own IP address, and set the PGV_DEBUG_SQL value to an expression such as
$_SERVER['REMOTE_ADDR'] == '124.45.67.89'
This will enable the debugging on all requests coming from this IP address.
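The condition described above compares the REMOTE_ADDR entry of PHP's $_SERVER array with your own address. As a minimal sketch, assuming a hypothetical helper named is_debug_client (not part of PGV) and the example address used in this post, it could be written so that it can be tried outside a web request:

```php
<?php
// Hypothetical helper (not part of PGV): true only when the request comes
// from the given developer address. $server is normally PHP's $_SERVER.
function is_debug_client($server, $developerIp)
{
    return isset($server['REMOTE_ADDR'])
        && $server['REMOTE_ADDR'] === $developerIp;
}

// In includes/session.php the existing line would then become something like:
// define('PGV_DEBUG_SQL', is_debug_client($_SERVER, '124.45.67.89'));

var_dump(is_debug_client(array('REMOTE_ADDR' => '124.45.67.89'), '124.45.67.89'));
var_dump(is_debug_client(array('REMOTE_ADDR' => '192.168.1.71'), '124.45.67.89'));
var_dump(is_debug_client(array(), '124.45.67.89'));
```

The isset() check keeps the expression quiet when REMOTE_ADDR is absent, for example when a script is run from the command line.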
It might depend on the IP address being correct. 192.168.1.71 looks like a NAT address, so it won't work if the MySQL server isn't on that network/LAN.
$_SERVER['REMOTE_ADDR'] == '124.45.67.89'
I think the forum is treating left/right square brackets as tags. Try this (using backslash to escape them)
$_SERVER\['REMOTE_ADDR'\]=='12.34.56.78'
Maybe your IP address is wrong. If you are accessing the site via a domain name (e.g. www.mysite.com), and this resolves back to a machine on your home network, then ask your router for your "external" IP address, not your "internal" one.
From the repetition in the last post, I'd guess that the forum software is treating the backslash as a regex-backreference, rather than as an escape character.
Perhaps doing a screen dump, and attaching an image is the only way…..
ggpauly wrote:
If there's agreement that a patch doing this might be accepted I'd be willing to work on it. It seems something this high-profile could use more than one person on it…
This was poorly written and hence misunderstood. I meant that I would need help. Sorry for the confusion.
pseesink wrote:
I have a similar situation. What about returning responsibility to the user who started the request?
For example, PGV could return a signal such as "500 results so far, still 90% to search", with the possibility to cancel.
There are a lot of competing concerns here - the idea of sorting the results on the client, difficulty of handling privacy controls and others. I'd like to note that the Individual List feature handles thousands with the same surname very elegantly in its context.
Warning the user is a good idea, but it fails to protect the system from various inelegant failures (if the user is patient or inattentive). Paged results address this issue.
kiwi_pgv wrote:
Not sure how far you have thought through your idea yet, but as I imagine a "paged" set of results, presumably we would lose the current ability to sort the results after display, by date of birth, place, death, age etc? I find that a very valuable tool.
Yes, this is an issue that requires some thinking through. To make sorting work with paged results, the sort has to be performed at the beginning of the query, that is, on the server, not the client. I don't think sorting makes much sense on partial results. However, if all the results fit on one page, then sorting (that is, re-sorting on a different key) would still make sense.
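To illustrate the order of operations, here is a minimal sketch that sorts the whole result set first and only then cuts out one page. The function name sorted_page and the sample records are made up for illustration and are not PGV code. In a database-backed version the same effect would usually come from an ORDER BY clause combined with LIMIT and OFFSET, so that only one page of rows reaches PHP.

```php
<?php
// Hypothetical helper (not part of PGV): sort the full result set on the
// server by one key, then return only the requested page (pages start at 1).
function sorted_page($rows, $sortKey, $page, $perPage)
{
    usort($rows, function ($a, $b) use ($sortKey) {
        return strcmp($a[$sortKey], $b[$sortKey]);
    });
    $offset = ($page - 1) * $perPage;
    return array_slice($rows, $offset, $perPage);
}

// Made-up sample data for illustration only.
$people = array(
    array('name' => 'Smith, Carl', 'birth' => '1902-05-01'),
    array('name' => 'Smith, Anna', 'birth' => '1875-11-23'),
    array('name' => 'Smith, Bert', 'birth' => '1850-01-09'),
);

// First page of two, ordered by date of birth.
foreach (sorted_page($people, 'birth', 1, 2) as $person) {
    echo $person['name'], ' (', $person['birth'], ")\n";
}
```

Changing the sort key then means re-running the query for page 1 with a different key, rather than re-sorting the rows already in the browser.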
A related issue is the advanced searches. At least some of the applications that sorting helps with could be addressed by date range searches (birth dates, death dates, last change dates, etc.). Searching on ranges of given names is already handled well by the Individual Lists feature.
George
Reorganizing your MySQL database may help - it did solve slow load times for me.
Thanks for the suggestions to look at the MySQL database as the underlying cause. I will discuss the issue further with my service provider.
A common cause of slow searching is incorrect or mismatched database and table collations. Try this patch file, following the instructions VERY carefully:
https://sourceforge.net/tracker/?func=detail&aid=2318005&group_id=55456&atid=477081
It generally helps.
Nigel
Um - the crappy forum seems to have mangled that last bit.
dollar underscore SERVER left-square-bracket single-quote REMOTE_ADDR single-quote right-square-bracket equals equals ….
So I changed:
/* define('PGV_DEBUG_SQL', false); PS 2010-02-10 */
define('PGV_DEBUG_SQL', $_SERVER == '192.168.1.71' );
?
And what can I expect now?
Should I see output on my PGV pages? So far nothing happens.
Or should I look in a logfile somewhere?
As I said in my following post, this forum has a habit of mangling anything that looks like code.
Read where I spelt it out.
…or perhaps maybe you did, and the forum mangled your code in the same way it mangled mine.
Just set it to true.
The bottom of every page (or every section on the index.php page) will have a huge table showing all the SQL. You won't miss it.
Can it really not cope with codifying this line? DOH!
This 'improved' SF forum really does have some major problems. Whatever happened to 'plain text'? :)
The email copies actually seem to get things right ….
Stunning forum software!