Thread: [Jython-bugs] [ jython-Bugs-1152612 ] vars(obj) returns PyStringMap instead of DictType


[Jython-bugs] [ jython-Bugs-1152612 ] vars(obj) returns PyStringMap instead of DictType

From: SourceForge.net <no...@so...> - 2006-12-22 14:24:32

Bugs item #1152612, was opened at 2005-02-26 22:17
Message generated for change (Comment added) made by leouserz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=112867&aid=1152612&group_id=12867

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Core
Group: Deferred
Status: Open
Resolution: None
Priority: 2
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: vars(obj) returns PyStringMap instead of DictType

Initial Comment:
When getting an object's __dict__, the type() of the  
dictionary object returns PyStringMap. This causes a  
problem because types.DictType does not match  
PyStringMap. Some existing Marshallers (in my case,  
xmlrpclib) expect an Instance's __dict__ to be a DictType  
when marshalling an Instance (such as an Exception).  
  
It looks like types.DictType should match  
org.python.core.PyStringMap. When getting the __dict__  
of an Instance in CPython, it returns a type of DictType.  
  
-Steve 
leo...@nu... 
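
The failure mode described above can be sketched in plain Python. This is only an illustration under stated assumptions: `StringMap` is a hypothetical stand-in for org.python.core.PyStringMap (dict-like, but not a dict), the two dispatch functions are made up for the example and are not xmlrpclib's actual code, and `dict` is the type that `types.DictType` names in Python 2.

```python
from collections.abc import Mapping


class StringMap(Mapping):
    """Hypothetical stand-in for PyStringMap: behaves like a dict, is not one."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


def strict_dispatch(value):
    # Looks up a handler by exact type, so only a real dict matches.
    handlers = {dict: "struct"}
    return handlers.get(type(value))


def tolerant_dispatch(value):
    # Accepts anything that implements the mapping protocol.
    return "struct" if isinstance(value, Mapping) else None


attrs = StringMap({"args": ("boom",)})
print(strict_dispatch(attrs))    # prints None: no handler for this type
print(tolerant_dispatch(attrs))  # prints struct
```

A marshaller written in the first style rejects the mapping even though it supports every operation the marshaller needs, which is the mismatch reported here.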

----------------------------------------------------------------------

Comment By: leouser (leouserz)
Date: 2006-12-22 14:24

Message:
Logged In: YES 
user_id=1277399
Originator: NO

It's possible that this could be fixed by just ditching the PyStringMap
used internally and switching over to PyDictionary.  From experimenting
with gutting PyStringMap and replacing its internal arrays and hashing with
a HashMap, I was able to get an increase in performance.  Given that
PyDictionary appears remarkably similar to that implementation (it forwards
to its Hashtable, yuck), there may not be a performance reason to stick
with PyStringMap (assuming performance is the reason there is a
PyStringMap).

leouser

----------------------------------------------------------------------


[Jython-bugs] [ jython-Bugs-1152612 ] vars(obj) returns PyStringMap instead of DictType

From: SourceForge.net <no...@so...> - 2006-12-22 19:10:30

Bugs item #1152612, was opened at 2005-02-26 17:17
Message generated for change (Comment added) made by kzuberi

>Comment By: Khalid Zuberi (kzuberi)
Date: 2006-12-22 14:10

Message:
Logged In: YES 
user_id=18288
Originator: NO

The only (little) help I can add is to note Samuele's recent reference to
performance & PyStringMap:

  http://article.gmane.org/gmane.comp.lang.jython.devel/2610

- kz

----------------------------------------------------------------------


[Jython-bugs] [ jython-Bugs-1152612 ] vars(obj) returns PyStringMap instead of DictType

From: SourceForge.net <no...@so...> - 2006-12-22 19:30:26

Bugs item #1152612, was opened at 2005-02-26 22:17
Message generated for change (Comment added) made by leouserz

Comment By: leouser (leouserz)
Date: 2006-12-22 19:30

Message:
Logged In: YES 
user_id=1277399
Originator: NO

Hmmm, speed-wise I'm not sure; I guess it depends on how quickly PyString
returns from a hashCode call.  From gutting PyStringMap and replacing it
with a Map that used the interned strings, I saw a boost in performance on
the test I was running.  So from that angle PyStringMap didn't seem that
speedy.

I would suspect that PyString would return as quickly as String.  Its
hashCode hashes the internal string and caches the value, so I would expect
equivalent behavior between the two.  Also, it looks like it should have a
speedy equals method, as long as the string is interned.  So I don't see
any terrible issues with using it as a key.

I think using a PyDictionary would make instances more compliant with
Python, given that I can take the dict from a Python instance and use
non-strings for keys.

leouser
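
The compliance point about non-string keys can be checked with a few lines on CPython. This is a minimal sketch; the class name `Thing` is made up for the example.

```python
class Thing(object):
    pass


t = Thing()
t.name = "spam"        # ordinary attribute, stored under a string key

d = vars(t)            # the instance's __dict__
d[42] = "answer"       # non-string key, accepted by CPython's plain dict

print(type(d) is dict)     # prints True on CPython
print(t.__dict__[42])      # prints answer
```

A map that only accepts String keys cannot reproduce the last two lines, which is the behavioral gap being pointed at.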

----------------------------------------------------------------------


[Jython-bugs] [ jython-Bugs-1152612 ] vars(obj) returns PyStringMap instead of DictType

From: SourceForge.net <no...@so...> - 2006-12-22 19:36:40

Bugs item #1152612, was opened at 2005-02-26 22:17
Message generated for change (Comment added) made by pedronis

>Comment By: Samuele Pedroni (pedronis)
Date: 2006-12-22 19:36

Message:
Logged In: YES 
user_id=61408
Originator: NO

The issue is all the places that have an already interned String, not a
PyString.  Converting a String to a PyString involves an allocation, and
allocations are still costly.

As for hashCode vs. identityHashCode, it is quite possible that the
performance trade-offs between the two have changed over time since 1.1.
Implementing identityHashCode is not straightforward on moving GCs.

----------------------------------------------------------------------


[Jython-bugs] [ jython-Bugs-1152612 ] vars(obj) returns PyStringMap instead of DictType

From: SourceForge.net <no...@so...> - 2006-12-22 19:54:41

Bugs item #1152612, was opened at 2005-02-26 22:17
Message generated for change (Comment added) made by leouserz

Comment By: leouser (leouserz)
Date: 2006-12-22 19:54

Message:
Logged In: YES 
user_id=1277399
Originator: NO

Hmm, I was thinking about having a PyString cache.  Instead of calling new
PyString("STRING OF SOMETHING"), pass the call off to a factory method and
have it return a cached PyString.  I was under the impression yesterday
that PyStringMap was getting PyStrings anyway and that they were passing on
their Strings.  So I'm not sure how switching to PyDictionary is going to
add any costs in this regard.

Yes, hashCode should be faster than System.identityHashCode().  Native
methods add overhead that you won't ever see with a simple accessor method.
String just returns a newly calculated hashCode or a cached one.

The performance difference I saw yesterday may not even be centered around
the difference between identityHashCode and hashCode; it may just be that
HashMap is more efficient in how it stores and retrieves things than
PyStringMap.

leouser

----------------------------------------------------------------------


[Jython-bugs] [ jython-Bugs-1152612 ] vars(obj) returns PyStringMap instead of DictType

From: SourceForge.net <no...@so...> - 2006-12-22 20:38:34

Bugs item #1152612, was opened at 2005-02-26 22:17
Message generated for change (Comment added) made by leouserz

Comment By: leouser (leouserz)
Date: 2006-12-22 20:38

Message:
Logged In: YES 
user_id=1277399
Originator: NO

Yup, I did some fiddling with PyJavaClass so that it used a PyDictionary
instead of a PyStringMap.  Performance-wise, it improved, but not to the
degree that it improved with PyStringMap.  Even having the Strings interned
in the PyDictionary did not give us as big a boost as PyStringMap did.
This may just mean that PyDictionary could use some additional tweaking.
Swapping in a HashMap will help a little, as there will be less lock
acquisition going on.  But I can't believe that is the key to the better
performance I was seeing.

leouser

----------------------------------------------------------------------


[Jython-bugs] [ jython-Bugs-1152612 ] vars(obj) returns PyStringMap instead of DictType

From: SourceForge.net <no...@so...> - 2006-12-22 20:48:22

Bugs item #1152612, was opened at 2005-02-26 22:17
Message generated for change (Comment added) made by leouserz

Comment By: leouser (leouserz)
Date: 2006-12-22 20:48

Message:
Logged In: YES 
user_id=1277399
Originator: NO

Aha, PyStringMap does have a magic method:
__finditem__(String data)

This gets invoked first, and if we go directly to the table in
PyDictionary we see a pretty good boost in performance there.  I guess the
default __finditem__ method of PyStringMap is less performant than
PyDictionary's __finditem__ chain.

leouser

----------------------------------------------------------------------


[Jython-bugs] [ jython-Bugs-1152612 ] vars(obj) returns PyStringMap instead of DictType

From: SourceForge.net <no...@so...> - 2006-12-22 20:58:25

Bugs item #1152612, was opened at 2005-02-26 22:17
Message generated for change (Comment added) made by pedronis

>Comment By: Samuele Pedroni (pedronis)
Date: 2006-12-22 20:58

Message:
Logged In: YES 
user_id=61408
Originator: NO

Notice that we want the synchronization. Although the language makes no
strong promises about what happens with implementations without the GIL,
Python style is influenced by the presence of the GIL in CPython. This
means that builtin types should have "atomic" behavior.

Now there are dissenting opinions
(http://effbot.org/pyfaq/what-kinds-of-global-value-mutation-are-thread-safe.htm)
on this, but the bottom line (because it has happened) is that if we
miss some synchronization related to some of the listed ops, someone will
end up filing a bug because some code is not behaving as it does on CPython.
We have had this kind of report from experienced Pythoneers (for example
some Twisted contributors), and telling them to add more locks themselves
doesn't really work or scale in practice, because it is too annoying,
especially if the code is primarily meant to run on top of CPython.
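
To make the "atomic" expectation concrete, here is a minimal Java sketch, not Jython code. `AtomicDictSketch` and its methods are hypothetical names, and `Collections.synchronizedMap` merely stands in for a dictionary whose individual operations hold a lock. Several threads store distinct keys and no entry is lost.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class AtomicDictSketch {
    // Hypothetical stand-in for a dict whose single operations are atomic.
    static Map<String, Integer> newAtomicDict() {
        return Collections.synchronizedMap(new HashMap<String, Integer>());
    }

    // Starts `threads` writers, each storing `perThread` distinct keys,
    // waits for them, and returns the final number of entries.
    static int fillConcurrently(Map<String, Integer> dict, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    dict.put("k" + id + "_" + i, i); // each put holds the map's lock
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return dict.size();
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(newAtomicDict(), 4, 1000));
    }
}
```

With a plain unsynchronized `HashMap` the same loop gives no such guarantee, which is the kind of difference from CPython that ends up reported as a bug.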

----------------------------------------------------------------------

Comment By: leouser (leouserz)
Date: 2006-12-22 20:48

Message:
Logged In: YES 
user_id=1277399
Originator: NO

aha, PyStringMap does have a magic method,
__finditem__(String data)

this gets invoked first, and if we go directly to the table in
PyDictionary we see a pretty good boost in performance there.  I guess the
default __finditem__ method of PyStringMap is less performant than
PyDictionary's __finditem__ chain.
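
A rough sketch of the String overload idea, not Jython's actual code: `FindItemSketch`, `setitem`, and `finditem` are hypothetical names, and a plain `HashMap` stands in for the dictionary's table. Java selects the overload from the static type of the argument, so a caller that already holds a String goes straight to the table.

```java
import java.util.HashMap;
import java.util.Map;

public class FindItemSketch {
    private final Map<Object, Object> table = new HashMap<>();
    int fastPathCalls = 0; // counts lookups that took the String overload

    void setitem(Object key, Object value) {
        table.put(key, value);
    }

    // General path: any object may be used as a key.
    Object finditem(Object key) {
        return table.get(key);
    }

    // String overload: chosen at compile time when the argument's static type is String.
    Object finditem(String key) {
        fastPathCalls++;
        return table.get(key);
    }

    public static void main(String[] args) {
        FindItemSketch d = new FindItemSketch();
        d.setitem("x", 1);
        d.finditem("x");          // String overload
        d.finditem((Object) "x"); // general overload
        System.out.println(d.fastPathCalls);
    }
}
```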

leouser

----------------------------------------------------------------------

Comment By: leouser (leouserz)
Date: 2006-12-22 20:38

Message:
Logged In: YES 
user_id=1277399
Originator: NO

yup, I did some fiddling with PyJavaClass so that it used a PyDictionary
instead of a PyStringMap.   Performance wise, it improved but not to the
degree that it improved with PyStringMap.  Even having the Strings interned
in the PyDictionary did not give us as big a boost as PyStringMap did. 
This may just mean that PyDictionary could use some additional tweaking. 
Swapping in a HashMap will help a little as there will be less lock
acquisition going on.  But I can't believe that is the key to the better
performance I was seeing.

leouser

----------------------------------------------------------------------

Comment By: leouser (leouserz)
Date: 2006-12-22 19:54

Message:
Logged In: YES 
user_id=1277399
Originator: NO

hmm, I was thinking about having a PyString cache.  Instead of calling new
PyString("STRING OF SOMETHING"), pass the call off to a factory method and
have it return a cached PyString.  I was under the impression yesterday
that PyStringMap was getting PyStrings anyway and that they were passing on
their Strings.  So I'm not sure how switching to PyDictionary is going to
add any costs in this regard.
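
The factory idea could look roughly like the sketch below. It is only an illustration: `Name` is a hypothetical stand-in for PyString, `NameCacheSketch.of` is an invented factory method, and `computeIfAbsent` assumes Java 8 or later.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class NameCacheSketch {
    // Hypothetical stand-in for PyString: an immutable wrapper around a Java String.
    static final class Name {
        final String value;

        private Name(String value) {
            this.value = value;
        }
    }

    private static final ConcurrentMap<String, Name> CACHE = new ConcurrentHashMap<>();

    // Factory method: returns the cached wrapper, allocating only on first use.
    static Name of(String value) {
        return CACHE.computeIfAbsent(value, Name::new);
    }

    public static void main(String[] args) {
        System.out.println(of("spam") == of("spam")); // same cached instance
    }
}
```

Note that this cache is unbounded, so it only makes sense for a limited set of names such as attribute names.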

Yes, hashCode should be faster than System.identityHashCode().  Native
methods add overhead that you won't ever see with a simple accessor method.
String just returns a newly calculated hashCode or a cached one.

The performance difference I saw yesterday may not even be centered around
the difference between identityHashCode and hashCode, it may just be that
the HashMap is more efficient in how it stores and retrieves things than
PyStringMap.

leouser

----------------------------------------------------------------------

Comment By: Samuele Pedroni (pedronis)
Date: 2006-12-22 19:36

Message:
Logged In: YES 
user_id=61408
Originator: NO

The issue is all the places that have an already interned String, not a
PyString. Going from String to PyString involves an allocation. Allocations
are still costly.

As for using hashCode vs. identityHashCode, it is quite possible that the
performance trade-offs of the two have changed over time since 1.1.
Implementing identityHashCode is not straightforward on moving GCs.
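
As a small illustration of why interning matters for identity-based lookup (a sketch using the JDK's `IdentityHashMap`, not PyStringMap itself; `InternedLookupSketch` and `newTable` are hypothetical names): a key with equal contents but a different identity misses, while its interned form hits without any wrapper allocation.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class InternedLookupSketch {
    // Builds a table keyed by object identity, as a lookup that relies on
    // interned Strings would be.
    static Map<String, Integer> newTable() {
        Map<String, Integer> table = new IdentityHashMap<>();
        table.put("answer", 42); // string literals are interned by the JVM
        return table;
    }

    public static void main(String[] args) {
        Map<String, Integer> table = newTable();
        String copy = new String("answer");           // equal contents, different object
        System.out.println(table.get(copy));          // miss: keys are compared with ==
        System.out.println(table.get(copy.intern())); // hit: intern() returns the literal's object
    }
}
```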

----------------------------------------------------------------------

Comment By: leouser (leouserz)
Date: 2006-12-22 19:30

Message:
Logged In: YES 
user_id=1277399
Originator: NO

hmmm, speed-wise I'm not sure, I guess it depends upon how quickly the
PyString is going to return from a hashCode call.  From gutting PyStringMap
and replacing it with a Map that used the interned strings I saw a boost in
performance on the test I was running.  So from that angle PyStringMap
didn't seem that speedy.

I would suspect that PyString would return as quickly as the String.  Its
hashCode hashes the internal string and caches the value, so I would
expect equivalent behavior between the two.  Also, it looks like it should
have a speedy equals method, as long as the string is interned.  So I
don't see any terrible issues using it as a key.

I think using a PyDictionary would make the instances more compliant with
Python, given that I can take the dict from a Python instance and use
non-strings for keys.

leouser

----------------------------------------------------------------------

Comment By: Khalid Zuberi (kzuberi)
Date: 2006-12-22 19:10

Message:
Logged In: YES 
user_id=18288
Originator: NO

The only (little) help I can add is to note Samuele's recent reference to
performance & PyStringMap:

  http://article.gmane.org/gmane.comp.lang.jython.devel/2610

- kz

----------------------------------------------------------------------

Comment By: leouser (leouserz)
Date: 2006-12-22 14:24

Message:
Logged In: YES 
user_id=1277399
Originator: NO

It's possible that this could be fixed by just ditching the PyStringMap
used internally and switching over to PyDictionary.  From experimenting
with gutting PyStringMap and replacing its internal arrays and hashing with
a HashMap, I was able to get an increase in performance.  Given that the
Dictionary appears remarkably similar to that implementation (forwarding
to its Hashtable, yuck), there may not be a performance reason to
stick with the PyStringMap (assuming that is the reason that there is a
PyStringMap).

leouser

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=112867&aid=1152612&group_id=12867

[Jython-bugs] [ jython-Bugs-1152612 ] vars(obj) returns PyStringMap instead of DictType

From: SourceForge.net <no...@so...> - 2006-12-22 21:16:05

Bugs item #1152612, was opened at 2005-02-26 22:17
Message generated for change (Comment added) made by leouserz

Comment By: leouser (leouserz)
Date: 2006-12-22 21:16

Message:
Logged In: YES 
user_id=1277399
Originator: NO

I think we may be able to get away with a ConcurrentHashMap.  It should
offer better scalability than the one lock per table that comes with the
Hashtable and is safer than the HashMap, which doesn't have any
synchronization.  I'm seeing roughly the same golden times I was seeing
yesterday with a PyDictionary that has had its __finditem__(String)
overridden and its Hashtable replaced with a ConcurrentHashMap.
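
For reference, a minimal sketch of what ConcurrentHashMap offers under contention. It is not the modified PyDictionary: `ConcurrentCountSketch` is a hypothetical name, and `merge` assumes Java 8 or later. Several threads update the same key and no increment is lost, without a single table-wide lock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ConcurrentCountSketch {
    // Each of `threads` workers increments the same key `perThread` times;
    // returns the final count stored under that key.
    static int countConcurrently(int threads, int perThread) {
        ConcurrentMap<String, Integer> table = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    table.merge("hits", 1, Integer::sum); // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return table.get("hits");
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 1000));
    }
}
```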

leouser

----------------------------------------------------------------------

[Jython-bugs] [ jython-Bugs-1152612 ] vars(obj) returns PyStringMap instead of DictType

From: SourceForge.net <no...@so...> - 2006-12-22 21:36:29

Bugs item #1152612, was opened at 2005-02-26 22:17
Message generated for change (Comment added) made by leouserz

Comment By: leouser (leouserz)
Date: 2006-12-22 21:36

Message:
Logged In: YES 
user_id=1277399
Originator: NO

It may be more scalable with threads but it is not more scalable with
memory.  ConcurrentHashMap appears to be a hog in terms of what it
consumes.  That nixes it for general mass usage.

leouser

----------------------------------------------------------------------

[Jython-bugs] [ jython-Bugs-1152612 ] vars(obj) returns PyStringMap instead of DictType

From: SourceForge.net <no...@so...> - 2007-05-19 18:24:34

Bugs item #1152612, was opened at 2005-02-26 22:17
Message generated for change (Comment added) made by amak

>Comment By: Alan Kennedy (amak)
Date: 2007-05-19 18:24

Message:
Logged In: YES 
user_id=647684
Originator: NO

Could solving this problem be as simple as changing
org/python/modules/types.java to read like this?

dict.__setitem__("DictType", new PyTuple(new PyObject[] {
    PyType.fromClass(PyDictionary.class),
    PyType.fromClass(PyStringMap.class),
}));

The isinstance builtin accepts a tuple of types as its second argument, as
can be seen in the definition for StringTypes.
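
The tuple-of-types check can be pictured with the plain Java sketch below. It does not use Jython's classes: `TupleOfTypesSketch` and `isInstanceOfAny` are hypothetical names, and `HashMap` and `Hashtable` are only stand-ins for PyDictionary and PyStringMap.

```java
import java.util.HashMap;
import java.util.Hashtable;

public class TupleOfTypesSketch {
    // Returns true if obj is an instance of at least one of the given classes,
    // which is how an isinstance check against a tuple of types behaves.
    static boolean isInstanceOfAny(Object obj, Class<?>... types) {
        for (Class<?> type : types) {
            if (type.isInstance(obj)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // HashMap and Hashtable are stand-ins for PyDictionary and PyStringMap.
        Class<?>[] dictTypes = { HashMap.class, Hashtable.class };
        System.out.println(isInstanceOfAny(new Hashtable<String, Object>(), dictTypes));
        System.out.println(isInstanceOfAny("not a mapping", dictTypes));
    }
}
```

One caveat with this approach: code that compares type(x) directly against types.DictType, rather than calling isinstance, would still not match a tuple.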

----------------------------------------------------------------------

[Jython-bugs] [ jython-Bugs-1152612 ] vars(obj) returns PyStringMap instead of DictType

From: SourceForge.net <no...@so...> - 2007-05-19 19:56:11

Bugs item #1152612, was opened at 2005-02-26 17:17
Message generated for change (Comment added) made by fwierzbicki

>Comment By: Frank Wierzbicki (fwierzbicki)
Date: 2007-05-19 15:56

Message:
Logged In: YES 
user_id=193969
Originator: NO

I don't think making PyStringMap pass an isinstance check against DictType
makes sense, since it does not implement the full interface of a DictType.
For example, it cannot take non-strings as keys, while one would expect
something of type DictType to accept them.
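
To make the objection concrete, here is a minimal plain-Python sketch. `StringOnlyMap` and `store_non_string_key` are hypothetical names, and the TypeError on non-string keys is an assumption made for illustration; this is not Jython's actual PyStringMap code, only a picture of how a string-only mapping would surprise code written against a real dict.

```python
class StringOnlyMap:
    """Hypothetical stand-in for a mapping that only accepts string keys."""

    def __init__(self):
        self._table = {}

    def __setitem__(self, key, value):
        # Assumption for illustration: non-string keys are rejected outright.
        if not isinstance(key, str):
            raise TypeError("keys must be strings, got %s" % type(key).__name__)
        self._table[key] = value

    def __getitem__(self, key):
        return self._table[key]


def store_non_string_key(mapping):
    """Return True if the mapping accepts an integer key, as a real dict does."""
    try:
        mapping[42] = "answer"
    except TypeError:
        return False
    return True
```

A caller that was told "this is a DictType" would reasonably expect `store_non_string_key` to succeed, which it does for a dict but not for the string-only stand-in.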

----------------------------------------------------------------------

Comment By: Alan Kennedy (amak)
Date: 2007-05-19 14:24

Message:
Logged In: YES 
user_id=647684
Originator: NO

Could solving this problem be as simple as changing
org/python/modules/types.java to read like this?

dict.__setitem__("DictType", new PyTuple(new PyObject[] {
    PyType.fromClass(PyDictionary.class),
    PyType.fromClass(PyStringMap.class),
}));

The isinstance built-in accepts a tuple of types as its second argument,
as can be seen in the definition for StringTypes.
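
As a rough illustration of the tuple idea in plain Python (a sketch only, not Jython's types module): `StringKeyedMap` and `DICT_TYPES` below are hypothetical names standing in for PyStringMap and a tuple-valued DictType.

```python
class StringKeyedMap:
    """Hypothetical stand-in for PyStringMap; deliberately not a dict subclass."""


# Hypothetical analogue of a tuple-valued DictType.
DICT_TYPES = (dict, StringKeyedMap)

namespace = StringKeyedMap()

# isinstance accepts a tuple of types, so both kinds of mapping pass.
accepted_by_isinstance = (
    isinstance(namespace, DICT_TYPES) and isinstance({}, DICT_TYPES)
)

# Code that compares the exact type instead of calling isinstance is not helped:
# a class never compares equal to a tuple.
accepted_by_type_compare = type(namespace) == DICT_TYPES
```

Under these assumptions the tuple approach only helps callers that go through isinstance; code that compares `type(x)` directly, or uses the type as a lookup key, would still not recognise the string-keyed mapping.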

----------------------------------------------------------------------

Comment By: leouser (leouserz)
Date: 2006-12-22 16:36

Message:
Logged In: YES 
user_id=1277399
Originator: NO

It may be more scalable with threads, but it is not more scalable with
memory.  ConcurrentHashMap appears to be a hog in terms of what it
consumes.  That nixes it for general mass usage.

leouser

----------------------------------------------------------------------

Comment By: leouser (leouserz)
Date: 2006-12-22 16:16

Message:
Logged In: YES 
user_id=1277399
Originator: NO

I think we may be able to get away with a ConcurrentHashMap.  It should
offer better scalability than the one lock per table that comes with the
Hashtable and is safer than the HashMap, which doesn't have any
synchronization.  I'm seeing roughly the same golden times I was seeing
yesterday with a PyDictionary that has had its __finditem__(String)
overridden and its Hashtable replaced with a ConcurrentHashMap.

leouser

----------------------------------------------------------------------

Comment By: Samuele Pedroni (pedronis)
Date: 2006-12-22 15:58

Message:
Logged In: YES 
user_id=61408
Originator: NO

Notice that we want the synchronization. Although Python makes no strong
promises about what happens with implementations without the GIL, Python
style is influenced by the presence of the GIL in CPython. This means that
builtin types should have an "atomic" behavior.

Now there are dissenting opinions
(http://effbot.org/pyfaq/what-kinds-of-global-value-mutation-are-thread-safe.htm)
on this, but the bottom line (because it has happened) is that if we
miss some synchronization related to some of the listed ops, someone will
end up filing a bug because some code is not behaving like on CPython. We
have had this kind of report from experienced Pythoneers (for example some
Twisted contributors), and telling them to add more locks themselves doesn't
really work or scale in practice, because it is too annoying, especially if
the code is meant to run primarily on top of CPython.
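
A small sketch of the kind of code this is about, in plain Python: several threads store into one shared built-in dict without any user-level lock. The names (`shared`, `writer`, `THREADS`, `WRITES_PER_THREAD`) are made up for the example. The language does not promise this works; it reflects the behaviour people are used to from CPython and would expect a compatible implementation to match.

```python
import threading

shared = {}               # plain built-in dict, no explicit lock around it
THREADS = 8
WRITES_PER_THREAD = 1000


def writer(thread_id):
    # Each thread stores its own distinct keys; no user-level locking.
    for i in range(WRITES_PER_THREAD):
        shared[(thread_id, i)] = i


workers = [threading.Thread(target=writer, args=(n,)) for n in range(THREADS)]
for w in workers:
    w.start()
for w in workers:
    w.join()

# Code written for CPython expects every individual store to have landed intact.
total = len(shared)
```

If the dictionary behind `shared` were an unsynchronized hash table, concurrent stores like these could lose entries or corrupt the table, which is the sort of divergence that ends up as a bug report.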

----------------------------------------------------------------------

Comment By: leouser (leouserz)
Date: 2006-12-22 15:48

Message:
Logged In: YES 
user_id=1277399
Originator: NO

aha, PyStringMap does have a magic method,
__finditem__(String data)

this gets invoked first, and if we go directly to the table in
PyDictionary we see a pretty good boost in performance there.  I guess the
default __finditem__ method of PyStringMap is less performant than
PyDictionary's __finditem__ chain.

leouser

----------------------------------------------------------------------

Comment By: leouser (leouserz)
Date: 2006-12-22 15:38

Message:
Logged In: YES 
user_id=1277399
Originator: NO

yup, I did some fiddling with PyJavaClass so that it used a PyDictionary
instead of a PyStringMap.   Performance wise, it improved but not to the
degree that it improved with PyStringMap.  Even having the Strings interned
in the PyDictionary did not give us as big a boost as PyStringMap did. 
This may just mean that PyDictionary could use some additional tweaking. 
Swapping in a HashMap will help a little as there will be less lock
acquisition going on.  But I can't believe that is the key to the better
performance I was seeing.

leouser

----------------------------------------------------------------------

Comment By: leouser (leouserz)
Date: 2006-12-22 14:54

Message:
Logged In: YES 
user_id=1277399
Originator: NO

hmm, I was thinking about having a PyString cache.  Instead of calling new
PyString("STRING OF SOMETHING"), pass the call off to a factory method and
have it return a cached PyString.  I was under the impression yesterday
that PyStringMap was getting PyStrings anyway and that they were passing on
their Strings.  So I'm not sure how switching to PyDictionary is going to
add any costs in this regard.
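
The factory idea can be sketched in plain Python as follows. `Wrapper`, `_cache` and `make_wrapper` are hypothetical names; `Wrapper` merely stands in for a wrapper object such as PyString, and nothing here is taken from Jython's sources.

```python
class Wrapper:
    """Hypothetical stand-in for a wrapper object such as PyString."""

    def __init__(self, value):
        self.value = value


_cache = {}


def make_wrapper(value):
    """Factory: return the cached wrapper for value, allocating only on a miss."""
    wrapper = _cache.get(value)
    if wrapper is None:
        wrapper = Wrapper(value)
        _cache[value] = wrapper
    return wrapper
```

Repeated calls with the same string return the same object, so the allocation is paid once per distinct string. This sketch never evicts entries and takes no lock, so a real cache would need a policy for both.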

Yes, hashCode should be faster than System.identityHashCode().  Native
methods add overhead that you won't ever see with a simple accessor method.
String just returns a newly calculated hashCode or a cached one.

The performance difference I saw yesterday may not even be centered around
the difference between identityHashCode and hashCode, it may just be that
the HashMap is more efficient in how it stores and retrieves things than
PyStringMap.

leouser

----------------------------------------------------------------------

Comment By: Samuele Pedroni (pedronis)
Date: 2006-12-22 14:36

Message:
Logged In: YES 
user_id=61408
Originator: NO

The issue is all the places that have an already interned String, not a
PyString. Going from String to PyString involves an allocation. Allocations
are still costly.

As for hashCode vs. identityHashCode, it is quite possible that the
performance trade-offs of the two have changed over time since 1.1.
Implementing identityHashCode is not straightforward on moving GCs.

----------------------------------------------------------------------

Comment By: leouser (leouserz)
Date: 2006-12-22 14:30

Message:
Logged In: YES 
user_id=1277399
Originator: NO

hmmm, speed wise I'm not sure, I guess it depends upon how quickly the
PyString is going to return a hashCode call.  From gutting PyStringMap and
replacing it with a Map that used the interned strings I saw a boost in
performance on the test I was running.  So from that angle PyStringMap
didn't seem that speedy.

I would suspect that PyString would return as quickly as the String.  Its
hashCode hashes the internal string and caches the value, so I would
expect equivalent behavior between the two.  Also, it looks like it should
have a speedy equals method, as long as the string is interned.  So I
don't see any terrible issues using it as a key.

I think using a PyDictionary would make the instances more compliant with
Python.  Given that I can take the dict from a Python instance and use
non-strings for keys.

leouser

----------------------------------------------------------------------

Comment By: Khalid Zuberi (kzuberi)
Date: 2006-12-22 14:10

Message:
Logged In: YES 
user_id=18288
Originator: NO

The only (little) help I can add is to note Samuele's recent reference to
performance & PyStringMap:

  http://article.gmane.org/gmane.comp.lang.jython.devel/2610

- kz

----------------------------------------------------------------------
