[PATCH v2 2/4] vbscript: Move the script context heap to the script dispatch object.

Thu Oct 24 08:32:34 CDT 2019

Hi Jacek,

On 10/23/19 9:16 PM, Jacek Caban wrote:
> Hi Gabriel,
> 
> On 10/21/19 1:40 PM, Gabriel Ivăncescu wrote:
>> The heap stores dynamic global variables added during the execution
>> phase. We'll need to reference them in the TypeInfo, and so we have to
>> reference count the heap itself, and the simplest way is to use the 
>> script
>> dispatch object to store it.
> 
> 
> It seems to me that you concentrated a bit too much at avoiding copying 
> the data rather than on root of the problem. It's possible to avoid 
> copying in different ways. The thing is that having to represent the 
> collection of global members in different ways for similar tasks 
> probably means that representation of the collection is not right. 
> Actually, we know that current global scope representation is not great. 
> A simple list was nice to have things done and working, but it's surely 
> not optimal. I don't expect you to rewrite it all, but it will be 
> eventually rewritten and it would be good to know that we're changing 
> things in the right direction instead of working on something that will 
> need to be rewritten for future improvements.
> 

Now that I've looked deeper into the vbscript implementation, I actually 
think the current one is wrong and don't quite understand how it even 
works, honestly.

It looks to me as the compilation phase that creates the vbscode_t that 
is then appened to the ctx->code_list adds functions and variables, and 
they are immediately linked to the context's global_funcs/vars 
respectively. This happens even *if* the code in question has 
"pending_exec" so it's not even executed immediately. Shouldn't those 
funcs/vars only be added to the global context list when they are being 
executed?

Furthermore, the funcs/vars during compilation are added to the 
vbscode_t's heap and stored there, while vars added during execution are 
added to the context's heap.

release_script will go through all global_vars to release their variant 
data, but in destroy_script, the code_list is released (and all 
vbscode_t heaps) *before* the script is released. Doesn't this mean some 
variables in the linked list will have already been freed (if they were 
part of some vbscode_t) and thus we're accessing freed memory when 
releasing the vars?

Overall I'm just slightly confused as to how it even works right now, so 
I've no idea how to improve it.

Here's my thought process on how to proceed, but I could be completely 
wrong.

First, store the global_vars/funcs in the script dispatch object, 
instead of the context (just like jscript does). Furthermore, when a 
code in the code_list is being executed, move it to the script dispatch 
object (for ref counting) and remove it from the context's list. (we'll 
essentially just have another list in the script dispatch object for 
code that has been executed).

This will allow to release the script by decreasing its state, releasing 
the code that was executed already, while *keeping* the code that is 
still pending, which I believe was the intent of the whole thing. Again, 
I could be misunderstanding this.

Secondly, link the compiled funcs/vars to the script dispatch object 
during the execution phase of that vbscode_t, instead of during 
compilation as we have now.

Does this seem a good way forward?

With this, the TypeInfo will just grab a reference to the script 
dispatch (which it has to anyway due making use of the ident_map) and 
possibly create arrays for vars/funcs like a snapshot, so they have the 
indices for the TypeInfo at the moment it is retrieved.

> 
> I don't know exactly how it should look like. It's even possible that we 
> will have to do some copies to capture the state after all. If I was 
> looking at it, I'd start with more experimentations. Your tests seem 
> like a good start. It would be interesting to see how things change when 
> you add more global members later. You tested that adding new global 
> variables doesn't change the existing ITypeInfo. How does the new one 
> look like? Does it append new variables after previous ones? What if we 
> added a new "dynamic" (non-explicit) variable? What if we mix it all? 
> How about functions? What do DISPIDs look like? Is it reasonable to try 
> to replicate native DISPIDs? Once we know more answers, we may decide 
> how the collection should look like.
> 

After more fiddling here's some bits of info:

The TypeInfo is frozen at the moment it is retrieved. Whatever happens 
to the script dispatch after that should not affect it. To simplify the 
explanation I will talk about "variables" but the same applies to 
functions as well.

If some variables are added and a new TypeInfo is retrieved, the new 
TypeInfo will contain those new variables, and their indices will be 
after the previous ones (i.e. they're appended).

I don't think it's worth replicating DISPIDs, because those are opaque. 
AFAIK no application should hard-code DISPIDs since they may also change 
between Windows versions (from experiments it does seem Windows tends to 
simply order the DISPIDs as they are found in the script, though, but 
separates vars/funcs).

However, the *indices* of the variables should probably be the same as 
on Windows. This is because you can obtain variables by index using 
GetVarDesc (again, same for functions). But, when it adds code, explicit 
variables are added first, then non-explicit variables. If more code is 
added, it will be appended to this, so they *can* be mixed. (e.g. first 
code adds 2 explicit vars, then 1 implicit var, second code adds 1 
explicit var and 2 implicit, it will be: e->e->i->e->i->i).

I think the cleanest approach would be for the TypeInfo to hold a ref to 
the script dispatch object, and all the necessary info to be stored 
there, just like it is done for jscript.

> 
> Once we know that, we should consider how to replace existing storage 
> with something suitable for ITypeInfo. The new collection would need to 
> be available for the interpreter and script dispatch (its GetIDsOfNames 
> should probably share the implementation with ITypeInfo one). 
> script_ctx_t may indeed be a good place. Another option could be 
> ScriptDisp or maybe another new struct. Note that we currently ignore 
> the case of closed script engine in other places (with things like 
> if(!This->ctx) return E_UNEXPECTED; in GetDispID) and until it's a 
> problem for an actual application, feel free to do a similar thing 
> (preferably with a FIXME). If we have to fix it, script_ctx_t is 
> intentionally done in a way that would be easy to make it ref-counted. 
> For that, however, it would be nice to see how both ITypeInfo, script 
> dispatch IDispatchEx and VBS class instance IDispatchEx behave on closed 
> script engines. Then we can decide how much of it do we want to keep 
> after releasing script engine itself.
> 
> 
> Thanks,
> 
> Jacek
> 

So, the TypeInfo is not affected by closing of the script engine.

The only problem I see is the separation of the code_list with the 
executed heap. In jscript I've done something similar, and in that case 
we *do* need to build an array regardless, because props can be deleted, 
while the TypeInfo remains and the var indices.

But in jscript a ref to the script dispatch can be held and the 
necessary refs to the function objects. I'd like it to be similar in 
vbscript, honestly, instead of having vars stored in multiple places.

Thanks,
Gabriel