Re: [Jython-dev] PyShadowString test, ideas, questions

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

CharSequence: good idea. Makes me think PyString should be the same. 
(Bigger step to take, however.)

There's a Unicode issue lurking here, as there always is with PyString 
and Java char. With PyString constructors we know that the chars in the 
arguments represent bytes by convention, and I think this should be the 
same. They would only be called from Java, where the caller ought to 
take care of the encoding. Broadening to PyObject makes it less likely 
the caller will do so, IMO.

Jeff Allen

On 30/08/2018 00:28, Stefan Richthofer wrote:
> Re 1) > the ultimate constructor guarantees stringiness
> Maybe we could have one public constructor based on CharSequence. That 
> would directly allow PyString and alike. Internally it would check for 
> instanceof PyObject, cast if possible and construct PyString from the 
> CharSequnce otherwise.
>
> Re 2) > I wondered if there is some document that explains what to 
> consider when adding a new builtin type that shall support 
> subclassing. I found this wiki page: 
> https://wiki.python.org/jython/GeneratedDerivedClasses that confirms 
> all we need to add is the check I suggested. If you know an additional 
> doc, please tell me.
>
> > Please feel free to go ahead with the changes.
> Alright. Perfect.
>
> Am So., 26. Aug. 2018 um 23:59 Uhr schrieb Jeff Allen 
> <ja...@fa... <mailto:ja...@fa...>>:
>
>     1) I don't recall the thinking at the time. There are an awful lot
>     of constructors here and I suspect I wanted to minimise. It looks
>     like the ultimate constructor guarantees stringiness, so that
>     wasn't the explanation. I'd be fine with public.
>
>     2) I think you're right about new. I'd prefer to add the for_type
>     clause than prohibit it, because sub-classing is required by the
>     tests (for it to be equivalent to str in all respects). We
>     discussed it below, I see.
>
>     Please feel free to go ahead with the changes.
>
>     Jeff Allen
>
>     On 26/08/2018 16:17, Stefan Richthofer wrote:
>>     Jeff, I encountered two more concerns with PyShadowString recently:
>>
>>     1) I would like to have the PyObject/PyObject constructor public.
>>     See https://github.com/Stewori/JyNI/pull/29. Is there an actual
>>     concern to keep it private?
>>     2) I noticed that PyShadowStringDerived is never used. In
>>     shadowstr_new some equivalent to this code (e.g. from PyString)
>>     would be needed:
>>     if (new_.for_type == subtype) {
>>          return new PyString(str);
>>     } else {
>>         return new PyStringDerived(subtype, str);
>>     }
>>     If we wouldn't add something like this there is no point to have
>>     PyShadowStringDerived at all. We should either add this or remove
>>     PyShadowStringDerived and prohibit subclassing of PyShadowString.
>>     Or do I overlook something?
>>     I can perform these changes if it's okay for you.
>>
>>     Best
>>
>>     -Stefan
>>
>>     Am Do., 30. Nov. 2017 um 08:13 Uhr schrieb Jeff Allen
>>     <ja...@fa... <mailto:ja...@fa...>>:
>>
>>         The point of having a copy of the list would be to prevent this:
>>
>>         >>> s = PyShadowString("hello", "bonjour")
>>         >>> s.addtarget(r"org\.python\.util\.InteractiveConsole")
>>         >>> t = s[:3]
>>         >>> t == "bon"
>>         True
>>         >>> s. gettargets().pop()
>>         ('org\\.python\\.util\\.InteractiveConsole',)
>>         >>> t == "bon"
>>         False
>>
>>         It is not a problem if the slice is only transient, as in if
>>         sys.platform[:4] == 'java' , which I agree is the only sort
>>         of slicing attested in the stdlib. If we imagine a slice
>>         getting a life of its own, the behaviour above would be very
>>         confusing, but I'm happy to risk that for now.
>>
>>         I think I have a decent implementation now of these several
>>         changes.
>>
>>         Jeff
>>
>>         Jeff Allen
>>
>>         On 27/11/2017 21:31, Stefan Richthofer wrote:
>>>         > I wonder why we write if sys.platform[:4] == 'java'
>>>         anyway. It's more work than startswith and you have to count
>>>         to 4 yourself.
>>>         It's a matter of fact that external libraries sometimes use
>>>         slicing for platform check. So far the implementation served
>>>         all use cases I found.
>>>         > suggest we expect PyShadowString(a, b)[m,n] to be
>>>         PyShadowString(a[m,n], b[m,n]), and I can see advantages to
>>>         that.
>>>         Good spot. Luckily this is mostly used for "win" which is
>>>         shorter than "java", but of course your version is safer and
>>>         more correct.
>>>         Please let's apply that.
>>>         > Also, if the targets list is mutable, I think each slice
>>>         must have its own.
>>>         I'm not sure if that is necessary. At least for all use
>>>         cases I know it is fine if the slice inherits or shares
>>>         target list from/with the original PyShadowString. Do I
>>>         overlook something?
>>>         *Gesendet:* Montag, 27. November 2017 um 21:33 Uhr
>>>         *Von:* "Jeff Allen" <ja...@fa...>
>>>         <mailto:ja...@fa...>
>>>         *An:* "Stefan Richthofer" <Ste...@gm...>
>>>         <mailto:Ste...@gm...>
>>>         *Cc:* "Jython Developers" <jyt...@li...>
>>>         <mailto:jyt...@li...>
>>>         *Betreff:* Re: PyShadowString test, ideas, questions
>>>
>>>         Thanks Stefan:
>>>
>>>         I'm happy with your answers. I have tried isBaseType = true
>>>         and can run the str tests now (passes of course). I've added
>>>         a *Derived.java file.
>>>
>>>         7. Slicing is interesting. Nice touch.
>>>
>>>         >>> s = PyShadowString("hello", "bonjour")
>>>         >>> s[:3]
>>>         'hel ( ==bon for targets )'
>>>
>>>         I appreciate the main use case is sys.platform[:n] , but
>>>         these were surprising (because the slice is only interpreted
>>>         relative to the main string):
>>>
>>>         >>> s[:7]
>>>         'hello ( ==bonjo for targets )'
>>>
>>>         >>> s[-3:]
>>>         'llo ( ==njo for targets )'
>>>
>>>         I suggest we expect PyShadowString(a, b)[m,n] to be
>>>         PyShadowString(a[m,n], b[m,n]), and I can see advantages to
>>>         that. Also, if the targets list is mutable, I think each
>>>         slice must have its own.
>>>
>>>         I wonder why we write if sys.platform[:4] == 'java' anyway.
>>>         It's more work than startswith and you have to count to 4
>>>         yourself.
>>>
>>>         Jeff
>>>
>>>         Jeff Allen
>>>         On 26/11/2017 10:16, Stefan Richthofer wrote:
>>>
>>>             Trying to give answers as far as I have them...
>>>
>>>             1. Should I be able to do this?
>>>
>>>             >>> s.gettargets().pop()
>>>             ('org\\.python\\.util\\.InteractiveConsole',)
>>>             >>> s == 'bonjour'
>>>             False
>>>
>>>             I think that's okay. target list is not private to alow
>>>             monkeypatching etc in usual Python fashion.
>>>
>>>             2. This seems like a useful "unconditional", but I
>>>             wonder if it is harmful:
>>>
>>>             >>> s.addtarget(None)
>>>             >>> s == 'bonjour'
>>>             True
>>>
>>>             No opinion. This behavior looks okay to me. Whoever uses
>>>             PyShadowString must be aware of entering evil zone anyway.
>>>
>>>             3. Confusing subject, but I'm not sure the exposure of
>>>             __eq__ is "canonical". I think the convention is __eq__
>>>             just wraps shadowstr___eq__, as done for startswith and
>>>             __repr__, but here there is "added value" in __eq__
>>>             (also in PyString.__eq__). shadowstr___eq__ will be
>>>             exposed as __eq__, and shouldn't PyShadowString.__eq__
>>>             do the same thing? What we need is complicated by the
>>>             wrapper PyObject._eq.
>>>
>>>             I think you're right. The check for isTarget should be
>>>             moved to shadowstr___eq__ I suppose. Sorry I overlooked
>>>             this when I wrote it.
>>>
>>>             4. Single-stepping through __eq__ I notice that a call
>>>             like s == 'bonjour' (i.e. a call to __eq__) is quite
>>>             costly, even when there are no targets that match. The
>>>             problem is in isTarget() and so I think I will try
>>>             changing the logic to:
>>>
>>>             return other==string || (other==shadow && isTarget())
>>>
>>>             but also, we can make isTarget() cheaper.
>>>
>>>             Thats' s a good improvement. Also, isTarget should
>>>             fail-fast if target list is empty (I thought I had done
>>>             it that way).On the other hand, performance is hardly a
>>>             priority here, because platform checks are usually
>>>             rather seldom, mostly on startup only.
>>>
>>>             5. I tried the tests of str on PyShadowString, to prove
>>>             it is also a str, but they fail because it cannot be
>>>             sub-classed. Is there a reason to prevent sub-classing?
>>>
>>>             This is because of isBaseType = false? We can remove
>>>             that. I just felt that shadow string contains enough
>>>             magic and should not be further extended, but there is
>>>             no concrete reason. We can allow subclassing.
>>>
>>>             6. And relatedly, when I was working on PyType I noticed
>>>             that it is possible to expose a Java sub-class as the
>>>             same Python type as its parent. This would mean that
>>>             type(PyShadowString("a", "b")) would be str even though
>>>             the behaviour is different. This sounds worth trying.
>>>
>>>             Hmm, I don't like that. ShadowString is not a string and
>>>             its type should be labled accordingly.
>>>
>>>             *Gesendet:* Sonntag, 26. November 2017 um 09:06 Uhr
>>>             *Von:* "Jeff Allen" <ja...@fa...>
>>>             <mailto:ja...@fa...>
>>>             *An:* "Jython Developers"
>>>             <jyt...@li...>
>>>             <mailto:jyt...@li...>, "Stefan
>>>             Richthofer" <Ste...@gm...>
>>>             <mailto:Ste...@gm...>
>>>             *Betreff:* PyShadowString test, ideas, questions
>>>
>>>             I've added a test for PyShadowString, our string that
>>>             can have two values at once. Trying to design a test
>>>             raises some questions.
>>>
>>>             >>> s = PyShadowString("hello", "bonjour")
>>>             >>> s == 'bonjour'
>>>             False
>>>             >>> s.addtarget(r"org\.python\.util\.InteractiveConsole")
>>>             >>> s == 'bonjour'
>>>             True
>>>
>>>             So far so good.
>>>
>>>             1. Should I be able to do this?
>>>
>>>             >>> s.gettargets().pop()
>>>             ('org\\.python\\.util\\.InteractiveConsole',)
>>>             >>> s == 'bonjour'
>>>             False
>>>
>>>             2. This seems like a useful "unconditional", but I
>>>             wonder if it is harmful:
>>>
>>>             >>> s.addtarget(None)
>>>             >>> s == 'bonjour'
>>>             True
>>>
>>>             3. Confusing subject, but I'm not sure the exposure of
>>>             __eq__ is "canonical". I think the convention is __eq__
>>>             just wraps shadowstr___eq__, as done for startswith and
>>>             __repr__, but here there is "added value" in __eq__
>>>             (also in PyString.__eq__). shadowstr___eq__ will be
>>>             exposed as __eq__, and shouldn't PyShadowString.__eq__
>>>             do the same thing? What we need is complicated by the
>>>             wrapper PyObject._eq.
>>>
>>>             4. Single-stepping through __eq__ I notice that a call
>>>             like s == 'bonjour' (i.e. a call to __eq__) is quite
>>>             costly, even when there are no targets that match. The
>>>             problem is in isTarget() and so I think I will try
>>>             changing the logic to:
>>>
>>>             return other==string || (other==shadow && isTarget())
>>>
>>>             but also, we can make isTarget() cheaper.
>>>
>>>             5. I tried the tests of str on PyShadowString, to prove
>>>             it is also a str, but they fail because it cannot be
>>>             sub-classed. Is there a reason to prevent sub-classing?
>>>
>>>             6. And relatedly, when I was working on PyType I noticed
>>>             that it is possible to expose a Java sub-class as the
>>>             same Python type as its parent. This would mean that
>>>             type(PyShadowString("a", "b")) would be str even though
>>>             the behaviour is different. This sounds worth trying.
>>>
>>>             Jeff
>>>
>>>             -- 
>>>             Jeff Allen
>>>
>>
>