It's hard to get a handle on how to communicate with the Czur over the USB Video Class. Analyzing the pcap in Wireshark reports bcdDevice: 0x0100. The USB Video Class 1.1 example doc from 2005 makes understanding the USB traffic somewhat easier.
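As a small aside on reading that value: the bcd fields in USB descriptors are binary-coded decimal, so 0x0100 reads as "1.00". Note that bcdDevice is the vendor's device release number; the Video Class version is normally carried in a separate bcdUVC field in the class-specific VideoControl header, which may be worth looking for in the same capture. A minimal sketch of the decoding (the function name decode_bcd is mine, not from any library):

```python
def decode_bcd(value: int) -> str:
    """Render a 16-bit USB BCD field (0xJJMN) as the string 'JJ.MN'."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("BCD field must fit in 16 bits")
    # Split into four 4-bit decimal digits, most significant first.
    digits = [(value >> shift) & 0xF for shift in (12, 8, 4, 0)]
    if any(d > 9 for d in digits):
        raise ValueError("not a valid BCD value")
    major = digits[0] * 10 + digits[1]
    return f"{major}.{digits[2]}{digits[3]}"


print(decode_bcd(0x0100))  # the value Wireshark reported for bcdDevice
```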
Basically USB has many standards, which add up to an implementation in which a USB Host (the PC) communicates with a USB Device (here, the USB Video device) over a cable in a particular format.
The actual signalling on the USB wire can mostly be ignored, since the operating system comes equipped with at least minimal device drivers that export a user-program-side interface. That interface can be used to discover a USB connected device and to send and receive "higher" protocol or "Class" protocol messages with it.
Each "Class" of device has a set of "procedures" that are roughly designed to discover, query and set up that type of USB device using these "Class" protocols. They operate with a set of "assumptions" that are safe to make when you know you're dealing with a particular device that belongs to that Class type.
A mouse or keyboard, for example, belongs to the HID (Human Interface Device) "Class", which is one of the simplest communication models to program.
A webcam belongs to the USB Video "Class" and is a lot more complicated.
Before any "Class" type communication can happen, the standard USB setup first has to figure out which class the USB device belongs to, then load that class driver into memory and begin using the Class protocols for it.
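To make the "figure out which class" step a bit more concrete: the class is declared in the descriptors, and for a webcam it usually sits in the interface descriptors (class code 0x0E for Video, 0x03 for HID). Here is a rough sketch that reads one standard 9-byte interface descriptor. The sample bytes are made up for illustration and are not taken from the Czur capture, and describe_interface is a hypothetical helper name.

```python
import struct

# A few class codes from the USB-IF assigned class list (not exhaustive).
USB_CLASSES = {0x03: "HID", 0x08: "Mass Storage", 0x0E: "Video", 0xEF: "Miscellaneous"}
# Subclass codes defined for the Video class.
VIDEO_SUBCLASSES = {0x01: "VideoControl", 0x02: "VideoStreaming",
                    0x03: "VideoInterfaceCollection"}


def describe_interface(desc: bytes) -> dict:
    """Decode a standard USB interface descriptor (bDescriptorType 0x04)."""
    if len(desc) < 9 or desc[0] < 9 or desc[1] != 0x04:
        raise ValueError("not a standard interface descriptor")
    # bLength, bDescriptorType, bInterfaceNumber, bAlternateSetting, bNumEndpoints,
    # bInterfaceClass, bInterfaceSubClass, bInterfaceProtocol, iInterface
    (_length, _dtype, number, alt, n_endpoints,
     cls, sub, _proto, _i_str) = struct.unpack_from("<9B", desc)
    info = {
        "interface": number,
        "alt_setting": alt,
        "endpoints": n_endpoints,
        "class": USB_CLASSES.get(cls, f"0x{cls:02X}"),
        "subclass": f"0x{sub:02X}",
    }
    if cls == 0x0E:  # only Video subclasses are named in this sketch
        info["subclass"] = VIDEO_SUBCLASSES.get(sub, info["subclass"])
    return info


# Hypothetical sample: interface 1, alt setting 0, no endpoints, Video / VideoStreaming.
sample = bytes([0x09, 0x04, 0x01, 0x00, 0x00, 0x0E, 0x02, 0x00, 0x00])
print(describe_interface(sample))
```

The bytes for a real device can be copied out of the configuration descriptor that Wireshark shows near the start of a capture.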
USB traffic can be monitored ("sniffed" or "captured") by inserting a "Tap" in the path between the operating system and the USB port driver. Everything comes out linearly, one message after the other.. but each read and write of a Class protocol message is conducted with a purpose in mind that serves the standard setup and operating procedure for that USB Class device.
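Most of those Class messages travel as control transfers, and each one starts with an 8-byte setup packet. Decoding that packet goes a long way towards seeing the "purpose" of each line in the capture. The sketch below is my own helper (decode_setup is not a library call); the request codes are the Video Class GET/SET codes as I understand them from the spec, and the example packet is hand-made, not from the Czur pcap.

```python
import struct

# Video Class request codes (bRequest) for class-specific control transfers.
UVC_REQUESTS = {
    0x01: "SET_CUR", 0x81: "GET_CUR", 0x82: "GET_MIN", 0x83: "GET_MAX",
    0x84: "GET_RES", 0x85: "GET_LEN", 0x86: "GET_INFO", 0x87: "GET_DEF",
}
REQUEST_TYPES = ("standard", "class", "vendor", "reserved")
RECIPIENTS = ("device", "interface", "endpoint", "other")


def decode_setup(packet: bytes) -> dict:
    """Decode the 8-byte setup stage of a USB control transfer."""
    if len(packet) != 8:
        raise ValueError("a setup packet is exactly 8 bytes")
    bm_request_type, b_request, w_value, w_index, w_length = struct.unpack("<BBHHH", packet)
    req_type = REQUEST_TYPES[(bm_request_type >> 5) & 0x3]
    recipient_code = bm_request_type & 0x1F
    # Only interpret bRequest as a Video Class request when the type bits say "class".
    name = UVC_REQUESTS.get(b_request) if req_type == "class" else None
    return {
        "direction": "device-to-host" if bm_request_type & 0x80 else "host-to-device",
        "type": req_type,
        "recipient": RECIPIENTS[recipient_code] if recipient_code < 4 else "reserved",
        "request": name or f"0x{b_request:02X}",
        "selector": w_value >> 8,     # control selector lives in the high byte of wValue
        "interface": w_index & 0xFF,  # low byte of wIndex
        "entity": w_index >> 8,       # high byte of wIndex (unit/terminal ID, 0 for the interface)
        "length": w_length,
    }


# Hand-made example: GET_CUR, selector 0x01, interface 1, 26 bytes requested.
# On a VideoStreaming interface, selector 0x01 is the probe control.
print(decode_setup(bytes.fromhex("a181000101001a00")))
```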
The USB Specifications for the USB Class are very "wordily" explained.. in a somewhat backwards fashion, because the sentences, while in English, follow an Adjective or Description before Object or Noun format. That might be because historically "standards" originated in France and the language structure there is different than in other countries. Over time this has become the "normal" way of writing a standards document, and it makes them somewhat difficult for less experienced people to contemplate. This is not unique to USB standards: many things in Science and Art begin with a staged preamble about what will be talked about before getting around to the subject matter. As a matter of course it comes across as "Indirect" and taxing on the casual "commoner listener", who is usually less patient and more focused on the task at hand, not ten years from now.
But moving on.. with the task at hand.
The widely available USB_Video_Example 1.1.pdf document explains the "model" or assumed idealistic example of a webcam, and what each linear procedure across the USB bus is doing after the USB Class driver is loaded and begins the task of setting up the device for communications. It also explains why things are the way they are and what purpose each call serves.
This is proving a useful document for understanding the Czur scanner.
Appearing as a USB Video 1.0 Class device, it has the standard interfaces one would expect, with a few differences to account for more interfaces, since the Czur scanner also supports a few different compressed and uncompressed frame formats and methods of delivering those frames of information over USB.
It's been a really great motivation for me to better understand USB communications at the Class level, which simpler classes rarely motivate one to do.
I suspect, however, that raw Class calls do not map directly onto the way the native Czur scanner software communicates with the Czur scanner.
Rather, on Windows there is a long history going back to Windows 98, then ME, 2000, XP, Vista, then Win7, 8 and now 10 of declaring and then abandoning various programming language styles, APIs and program build tools.
The USB Video Class came into being around the time of Visual C/C++ and the Video for Windows API era. That was superseded by the STI/WIA era, which more or less emulated or copied TWAIN, the more widely used and still popular dedicated scanner API for many operating systems.
Then in Vista Microsoft re-thought (complicated) things by making WIA persona non grata.. removing most of the functionality for supporting the USB Video Class Still Image feature and banishing it to the DirectX/DirectShow "game development API".. because scanning is like "First Person Shooting, I guess?"
DirectShow in turn was scheduled for deprecation, to be replaced by the Windows Media Foundation classes (for Movie making), but that replacement never quite got off the ground.
In the meantime.. C/C++, which replaced C/ASM development on old Windows, was itself depopularized or demonized by "Managed Code" after the altercations with Sun/Java and the emergence of the CLR and .NET family of languages.. but somehow they never bothered to port the "legacy useful APIs", so it was left as an "exercise" for individual programmers to "implement" Wrappers to produce code "Bindings" between legacy C/C++ libraries and .NET (so why is it we're doing .NET if it's not good for anything ??)
So much has been left behind and so many people have left Microsoft in the last decade.. not to mention the little mobile problems.. that few people, apart from some MVPs outside the company, seem to recall how to implement examples for things like AVCAP2.exe, which included perhaps the oldest and only example of using the USB Video class for Still Image capture.
So...
It's speculation.. but by using DirectShow GraphEdit (a Filter "programming" tool) to prototype with the Microsoft USBVideo.sys Class driver talking to the Czur scanner, a "live" Rendering Window could be created.
Unfortunately a "functional" Still Image prototype could not be made to "fire" or snag a full frame image, because Microsoft deprecated the USBCmdButton API before it got very far.. and refers developers to instantiating the USB Video Controller and manually firing or hooking the device hardware interrupt to actually "activate" the Still Image function on the scanner.
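For reference, at the raw Class level the "fire" step is, as far as I can tell from the Video Class spec, a SET_CUR request on the still image trigger control of the VideoStreaming interface. The sketch below only builds the bytes for that control transfer; it does not talk to any hardware. The interface number is an assumption (it has to come from the device's own descriptors), build_still_trigger is a hypothetical helper name, and whether the Czur scanner honours this request is untested here.

```python
import struct

SET_CUR = 0x01                          # Video Class request code
VS_STILL_IMAGE_TRIGGER_CONTROL = 0x05   # VideoStreaming interface control selector
TRIGGER_TRANSMIT_STILL = 0x01           # bTrigger value: "transmit still image"


def build_still_trigger(streaming_interface: int) -> tuple[bytes, bytes]:
    """Return (setup_packet, data_stage) for a still image trigger request."""
    if not 0 <= streaming_interface <= 0xFF:
        raise ValueError("interface number must fit in one byte")
    bm_request_type = 0x21  # host-to-device, class request, recipient = interface
    setup = struct.pack(
        "<BBHHH",
        bm_request_type,
        SET_CUR,
        VS_STILL_IMAGE_TRIGGER_CONTROL << 8,  # selector in the high byte of wValue
        streaming_interface,                  # entity ID 0 in the high byte of wIndex
        1,                                    # wLength: the control is one byte long
    )
    return setup, bytes([TRIGGER_TRANSMIT_STILL])


# Assumed interface number 1, purely for illustration.
setup, data = build_still_trigger(1)
print(setup.hex(), data.hex())
```

Seeing (or not seeing) a setup packet like this in a capture of the native Czur software would be one way to check whether it uses the Class-level still image method at all.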
It's all rather frustrating.
The latest development seems to be that more people prefer using a Linux derived Kernel driver on the latest Windows operating system Kernels, because of the schizophrenic, incomplete API hell developers appear to have to go through to create an application. But because Windows Kernel drivers must be signed, and signed with a cross-signed code signing certificate that costs hundreds of dollars a year to maintain.. Windows kernel driver development is grinding to a halt. -- I think in family dynamics this is called a "Habitual Dysfunctional Situation".
It's speculation, however the Czur native software appears to include several software development kit components or pieces of frameworks. Notably avcodec, avdevice, avfilter, avformat and avutil (these are FFmpeg library names) -- which would point towards a possible DirectShow or WMC connection for invoking the USB Video Still Image capture function. .NET libraries are nowhere to be found.
VideoLAN Client or VLC is notable in that it can [also] invoke the USBVideo.sys Class driver and acquire Live video at various resolutions (invoking dshow:// directly).. but while it can capture Snapshots from its live feed, it does not appear to have any built-in functionality for invoking the native USB Video Class Still Image capture method, which in theory should be much clearer and more stable.