[RFC PATCH v3 2/2] server: Add USD support with timestamp updates.

Tue Apr 28 12:11:42 CDT 2020

On 4/28/20 6:40 PM, Zebediah Figura wrote:
> On 4/28/20 10:11 AM, Rémi Bernon wrote:
>> Pages are created on demand, depending on the initial values the process
>> requests. These values usually include NT version information, processor
>> features and number of physical pages.
>>
>> The server maps the pages write-only and the clients map them read-only,
>> then the server updates the timestamps on every page using a 16 ms
>> timeout.
>>
>> Note that it is hard to split the timestamp updates to a separate patch
>> as it would require to map the pages RW on the client side to keep the
>> ntoskrnl updates working, and potentially have some issues with RW pages
>> being shared across clients.
>>
>> Signed-off-by: Rémi Bernon <rbernon at codeweavers.com>
>> ---
>>
>> Note that we could also do one page per process if simpler is better,
>> but that would also increase the work to do on every update.
> 
> One thing that occurs to me as being potentially desirable, if only 
> potentially, is to actually use these counters to implement our tick 
> count APIs, in which case we'd absolutely want to have one page per 
> process. I know we've already had applications that want GetTickCount() 
> to be extremely fast, and that'd allow us to get a little more Unix code 
> out of kernel32. Maybe there's something I'm missing, though.

Not sure to see why that would make one-page per process absolutely 
necessary? Is there any timer here that is process-specific? But yeah I 
had the same thing in mind for later, and it seems that on Windows the 
GetTickCount and so on are just reading from the USD, or at least the 
values perfectly match in the test.

>> +static void kusd_mapping_set_current_time( struct kusd_mapping *kusd )
>> +{
>> +    kusd->data->SystemTime.High2Time = (current_time >> 32);
>> +    kusd->data->SystemTime.LowPart   = (current_time & 0xffffffff);
>> +    kusd->data->SystemTime.High1Time = (current_time >> 32);
>> +
>> +    kusd->data->InterruptTime.High2Time = (monotonic_time >> 32);
>> +    kusd->data->InterruptTime.LowPart   = (monotonic_time & 0xffffffff);
>> +    kusd->data->InterruptTime.High1Time = (monotonic_time >> 32);
>> +
>> +    kusd->data->TickCount.High2Time    = ((monotonic_time / 10000) >> 
>> 32);
>> +    kusd->data->TickCount.LowPart      = ((monotonic_time / 10000) & 
>> 0xffffffff);
>> +    kusd->data->TickCount.High1Time    = ((monotonic_time / 10000) >> 
>> 32);
>> +    kusd->data->TickCountLowDeprecated = ((monotonic_time / 10000) & 
>> 0xffffffff);
>> +}
> 
> Do we want to use atomic writes here, to make sure the compiler does the 
> right thing?

Possibly and it's probably not a bad thing to add to be safe, indeed. 
The members are already volatile, so there should be no reordering 
between each write/read, and the correct way to read the data from what 
I could see is to compare High1Time with High2Time --but I guess there's 
some application not doing it right out there.

> 
>> +
>> +static void update_kusd_mappings( void *private )
>> +{
>> +    struct kusd_mapping *kusd;
>> +
>> +    add_timeout_user( kusd_timeout, update_kusd_mappings, NULL );
>> +
>> +    LIST_FOR_EACH_ENTRY( kusd, &kusd_mappings, struct kusd_mapping, 
>> entry )
>> +        kusd_mapping_set_current_time( kusd );
>> +}
>> +
>> +int get_user_shared_data_fd( const usd_init_t *init, unsigned int 
>> *size )
>> +{
>> +    static const WCHAR default_windirW[] = 
>> {'C',':','\\','w','i','n','d','o','w','s',0};
>> +    struct kusd_mapping *kusd;
>> +    int i;
>> +
>> +    LIST_FOR_EACH_ENTRY(kusd, &kusd_mappings, struct kusd_mapping, 
>> entry)
>> +    {
>> +        if (!memcmp(init, &kusd->init, sizeof(usd_init_t)))
>> +            goto done;
>> +    }
>> +
>> +    if (!(kusd = mem_alloc( sizeof(struct kusd_mapping) )))
>> +        return -1;
>> +
>> +    kusd->fd = create_temp_file( sizeof(*kusd->data) );
>> +    kusd->data = mmap( NULL, sizeof(*kusd->data), PROT_WRITE, 
>> MAP_SHARED, kusd->fd, 0 );
>> +    kusd->init = *init;
>> +
>> +    kusd->data->NumberOfPhysicalPages = init->number_of_physical_pages;
>> +    kusd->data->NtMajorVersion = init->nt_major_version;
>> +    kusd->data->NtMinorVersion = init->nt_minor_version;
>> +    kusd->data->NtProductType = init->nt_product_type;
>> +    for (i = 0; i < 64; ++i)
>> +        kusd->data->ProcessorFeatures[i] = init->processor_features[i];
>> +
>> +    memcpy( kusd->data->NtSystemRoot, default_windirW, 
>> sizeof(default_windirW) );
>> +    kusd->data->TickCountMultiplier = 1 << 24;
>> +
>> +    kusd_mapping_set_current_time( kusd );
>> +
>> +    if (list_empty( &kusd_mappings )) add_timeout_user( kusd_timeout, 
>> update_kusd_mappings, NULL );
>> +    list_add_tail( &kusd_mappings, &kusd->entry );
>> +
>> +done:
>> +    *size = sizeof(*kusd->data);
>> +    return kusd->fd;
>> +}
>> +
>>   /* create a file mapping */
>>   DECL_HANDLER(create_mapping)
>>   {
>> diff --git a/server/process.c b/server/process.c
>> index 73984f363f59..6bb18ec6617a 100644
>> --- a/server/process.c
>> +++ b/server/process.c
>> @@ -1385,6 +1385,8 @@ DECL_HANDLER(init_process_done)
>>       if (req->gui) process->idle_event = create_event( NULL, NULL, 0, 
>> 1, 0, NULL );
>>       if (process->debugger) set_process_debug_flag( process, 1 );
>>       reply->suspend = (current->suspend || process->suspend);
>> +
>> +    send_client_fd( process, get_user_shared_data_fd( get_req_data(), 
>> &reply->usd_size ), -1 );
>>   }
>>   /* open a handle to a process */
>> diff --git a/server/protocol.def b/server/protocol.def
>> index 06a29b153ea0..4a935fd7efcf 100644
>> --- a/server/protocol.def
>> +++ b/server/protocol.def
>> @@ -787,6 +787,16 @@ struct rawinput_device
>>       user_handle_t  target;
>>   };
>> +/* Initial values for user shared data */
>> +typedef struct
>> +{
>> +    unsigned int   number_of_physical_pages;
>> +    unsigned int   nt_major_version;
>> +    unsigned int   nt_minor_version;
>> +    unsigned short nt_product_type;
>> +    unsigned char  processor_features[64];
>> +} usd_init_t;
>> +
>>   /****************************************************************/
>>   /* Request declarations */
>> @@ -854,8 +864,10 @@ struct rawinput_device
>>       mod_handle_t module;       /* main module base address */
>>       client_ptr_t ldt_copy;     /* address of LDT copy (in thread 
>> address space) */
>>       client_ptr_t entry;        /* process entry point */
>> +    VARARG(usd,usd_init);      /* initial USD values */
>>   @REPLY
>>       int          suspend;      /* is process suspended? */
>> +    unsigned int usd_size;     /* size of USD mapping */
>>   @END
> 
> Maybe it would make more sense to instead pass the whole initial 
> KUSER_SHARED_DATA structure as a VARARG to the server. That lets the 
> server be entirely agnostic as to what's in it aside from timestamps, 
> and makes it even easier to set new fields later.
> 

Yeah, it seemed a little bit overkill at first. An I didn't want to 
bother including wdm.h here, but why not. As I was doing page 
de-duplication, it felt nicer to know which information was actually used.

>> diff --git a/server/trace.c b/server/trace.c
>> index 2b58ed9fd2c0..31560eaad827 100644
>> --- a/server/trace.c
>> +++ b/server/trace.c
>> @@ -860,6 +860,28 @@ static void dump_varargs_startup_info( const char 
>> *prefix, data_size_t size )
>>       remove_data( size );
>>   }
>> +static void dump_varargs_usd_init( const char *prefix, data_size_t 
>> size )
>> +{
>> +    usd_init_t init;
>> +    int i;
>> +
>> +    memset( &init, 0, sizeof(init) );
>> +    memcpy( &init, cur_data, min( size, sizeof(init) ));
>> +
>> +    fprintf( stderr,
>> +             
>> "%s{number_of_physical_pages=%u,nt_major_version=%04x,nt_minor_version=%04x," 
>>
>> +             "nt_product_type=%02x,processor_features={",
>> +             prefix, init.number_of_physical_pages, 
>> init.nt_major_version, init.nt_minor_version,
>> +             init.nt_product_type );
>> +    for (i = 0; i < ARRAY_SIZE(init.processor_features); ++i)
>> +    {
>> +        if (i > 0) fputc( ',', stderr );
>> +        fprintf( stderr, "%02x", init.processor_features[i] );
>> +    }
>> +    fprintf( stderr, "}}" );
>> +    remove_data( size );
>> +}
>> +
> 
> In which case you'd probably just get rid of this.
> 
>>   static void dump_varargs_input_records( const char *prefix, 
>> data_size_t size )
>>   {
>>       const INPUT_RECORD *rec = cur_data;
>>
> 
> 

If one page per process is preferable, then it can actually be passed to 
the client earlier, with possibly the version lookup done on the server 
side, and maybe the cpuinfo and page count as well if the values are 
identical for 32bit and 64bit processes?

So if it's possible, there would be no need to pass anything from the 
client to the server. If not, then it's just a matter of passing cpu 
features and page count, which I feel is better than passing the whole 
initial USD.
-- 
Rémi Bernon <rbernon at codeweavers.com>