https://github.com/zufuliu/notepad2/commit/d4039b7e4fe1509f9ef8816a6b9ea95dfeaca532
Add UniqueCopy()
into UniqueString.h, which can be reused for Action::Create()
, UniqueStringCopy()
and UniqueStringSet::Save()
.
template <typename T>
inline std::unique_ptr<T[]> UniqueCopy(const T *data, size_t length) {
std::unique_ptr<T[]> copy(new T[length]); // std::make_unique_for_overwrite<T[]>(length)
memcpy(copy.get(), data, length * sizeof(T));
return copy;
}
UniqueStringCopy()
could be rewrote as:
UniqueString UniqueStringCopy(const char *text) {
if (!text) {
return {};
}
return UniqueCopy(text, std::char_traits<char>::length(text) + 1);
}
UniqueStringSet::Save()
can be changed into following to avoid compute string length again.
strings.push_back(UniqueCopy(sv.data(), sv.length() + 1));
return strings.back().get();
https://github.com/zufuliu/notepad2/commit/43c2718d4cb1531145aeec978368948452521bc8
Replace std::vector<char>
with std::unique_ptr<char[]>
, this would reduce code size a lot, especially for Document::FindText()
, where vector small size optimization may not works well (e.g. UTF-8 case insensitive search requires at least 33 bytes), though VarBuffer
like structure can be added for optimize for short search string (e.g. use 256 bytes stack for about 15 characters string).
All platform's implementation for ListBox::SetList()
can be changed into strlen()
+ UniqueCopy()
(e.g. Win32 LineToItem
).
https://github.com/zufuliu/notepad2/commit/6efaf40ada8804b67c4a3acbc81f00459f6f100c
Optimize Win32 ListBoxX
implementation to avoid repeated compute item text length, e.g. inside ListBoxX::Draw()
and ListBoxX::GetDesiredRect()
, the length is computed twice (strlen()
+ TextWide()
constructor).
The 4 bytes padding after ListItemData
can be used to store the text length computed inside AppendListItem()
. item text string length compute can be further optimized out by using pointer subtraction inside ListBoxX::SetList()
, so here only needs a single call to strlen()
.
I understand a desire to standardize on one approach to similar issues but don't think all of these are worthwhile. The patch does not apply cleanly to current Scintilla and contains the non-portable
__builtin_strlen
.In particular, the use of
UniqueString
for the undo stack is a low benefit change where there are large potential improvements to make. Many undo actions are just for a single byte (user typing or pressing delete multiple times) and the memory allocation for that is wasteful with an independent allocation of a single byte string. This is aligned up, generally to 16 bytes and allocators commonly use around 2 pointer/size items for management: so 32 bytes of memory to store 1 byte of data. As the undo stack is strictly LIFO, the text of all actions can be stored in one large allocation much more efficiently. Creating a dependence onUniqueString
(and an inverse requirement forUniqueString
to handle undo actions) is an unnecessary sideways step.UniqueCopy
usesnew
instead ofstd::make_unique
, perhaps (given themake_unique_for_overwrite
comment) to avoid initialisation butnew
always initializes. Having a templatedUniqueCopy
implies it will be instantiated over more types but its only used overchar
.Recalculating string length is insignificant when there was just a string compare loop and there is about to be an allocation.
Stack allocation for
searchThing
could be an improvement but I'd want to see measurements as its likely to be swamped by other code.For the list box, it may be worthwhile simplifying the code at the expense of some memory by shifting to
std::string
inListItemData
as its unlikely the benefit of the current approach is significant.https://github.com/zufuliu/notepad2/commit/29e8b146929ef368e64c408d3b06c8694f6b8819 and https://github.com/zufuliu/notepad2/commit/18aa425697bf6ffa718ae8ef16ec92ffcf994be0 implemented small string optimization for undo action by using structure padding (
sizeof(size_t) - 2
, so 2 bytes for 32 bit system and 6 bytes for 64 bit system), no custom move constructor is needed.UniqueCopy()
indeed not benefits much (only avoidedmemset
after call new).For undo action, maybe it could store short data inline?
Edit: the Action change does not work (random crash).
Last edit: Zufu Liu 2022-10-23
The crash is due to lacking custom move constructor (with copy constructor, copy/move assignment operators deleted).
Following is a small change (make
sizeof(Action) == 4*sizeof(size_t)
and default constructor asAction() noexcept = default;
) which makes Scintilla.dll 512 bytes smaller (VS2022 x64).Committed similar to
Action-1028.diff
as [341a21].Related
Commit: [341a21]
Following is some changes for CaseConvert.cxx
https://github.com/zufuliu/notepad2/commit/d862dda14bc08a87083f24d67d06db1aa7a7fe64
AddSymmetric()
andSetupConversions()
changed to member function ofCaseConverter
to remove extra conversion comparison.SetupConversions()
moved intoConverterForConversion()
to remove duplicated code. It might better to usestd::unreachable()
for the switch.maxConversionLength
increased to 7 to simplifyCharacterConversion
constructor.AddSymmetric()
andSetupConversions()
.Adding an extra byte so there isn't padding has no justification. The byte isn't going to be used. This appears to me to be motivated by a dumb linter that is pointing out where there is padding. If so, just turn it off.
This is copying garbage uninitialized memory (0xcc in MSVC debug) into conversion.conversion and making the code more fragile to changing requirements. Its a string: use its actual length. Isn't reading uninitialized memory like this undefined behaviour?
noexcept
onConverterForConversion()
needs to be removed.Updated comment for
maxConversionLength
to// use 7 to make sizeof(ConversionString)==8 and simplify CharacterConversion's constructor
.Use
memcpy()
forCharacterConversion
's constructor simplified the code (as the 8 bytes copy will be inlined),const char *conversion_
parameter is at least 8 bytes and always nul-termiated in bothAddSymmetric()
andSetupConversions()
.zeroed
char converted[maxConversionLength + 1]{};
inAddSymmetric()
is not needed asUTF8FromUTF32Character()
always puts a NUL.If
const char *conversion_
should be treated as untrusted string, thenmemcpy()
can be replace withstrcpy()
orstrncpy()
.Changes without touch ConversionString and CharacterConversion structures.
All the detail code for copying chars around is repetitive and easy to get wrong so it may be better to extend the use of
string_view
to raise the level of abstraction. Here's a version based on an earlier patch that usesstring_view
more for decoding the data.OK, your patch is modern than mine.
Changed title since the original didn't say anything
Committed updated
CaseConvert
changes with [50eec5].Improving undo is a larger project.
Couldn't see any clear benefits in changes to
UniqueString
/UniqueCopy
.Related
Commit: [50eec5]
Here is my currently used
Action
, that avoid allocation for small data.https://github.com/zufuliu/notepad2/blob/main/scintilla/src/CellBuffer.h#L35
The new UndoHistory is little slow, compare Notepad2's develop branch (with new UndoHistory) and main branch (use above Action class), replacing all dot to comma for
0N 9V.txt
(see [feature-requests:#1502], https://github.com/notepad-plus-plus/notepad-plus-plus/issues/10930#issuecomment-998760967, renaming the file to0N 9V.log
to use monospaed font), it's 50ms slow (tested on i3 and i5).apply following change to show time in command prompt:
Related
Feature Requests:
#1502