I specifically mentioned PDAL in my original post because I think its architecture is uniquely well suited to forming a basis for what I have in mind.
For pipelines that can be executed in streaming mode, there are effectively no memory limits on the size of the data that can be processed. For operations that do need to load the entire data set into memory, PDAL has plenty of tiling and decimation options so the user can keep that manageable on the hardware they have available. I think the ability to mix proprietary and open source stages in the same pipeline is also very important, as it is not reasonable to expect that all tools and algorithms will be released as open source. Similarly, because each "Filter" is abstracted as a separate stand-alone stage in the pipeline, filter authors have a great deal of freedom in how those individual filters are implemented. Some filters might be written in Python, some might run on the GPU, and some might require exotic libraries that are only available on specific platforms.
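To illustrate why streaming removes the memory ceiling, here is a minimal pure-Python sketch. It is not PDAL code, and `read_chunks`, `keep_above`, and `running_bounds` are hypothetical names I made up for the example. Each stage only ever holds one fixed-size chunk of points, so peak memory depends on the chunk size rather than on the size of the data set.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Point = Tuple[float, float, float]

def read_chunks(points: Iterable[Point], chunk_size: int) -> Iterator[List[Point]]:
    """Hypothetical reader: yield points in fixed-size chunks."""
    it = iter(points)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk

def keep_above(chunks: Iterable[List[Point]], min_z: float) -> Iterator[List[Point]]:
    """Hypothetical streamable filter: it only needs the current chunk."""
    for chunk in chunks:
        yield [p for p in chunk if p[2] >= min_z]

def running_bounds(chunks: Iterable[List[Point]]) -> Tuple[int, float, float]:
    """Hypothetical sink: fold the chunks into a point count and a Z range."""
    count, lo, hi = 0, float("inf"), float("-inf")
    for chunk in chunks:
        for _, _, z in chunk:
            count += 1
            lo, hi = min(lo, z), max(hi, z)
    return count, lo, hi

# A generator stands in for a file that is too large to load at once.
source = ((float(i), float(i), float(i % 10)) for i in range(1000))
count, z_min, z_max = running_bounds(keep_above(read_chunks(source, 64), 5.0))
print(count, z_min, z_max)
```

An operation that needs global context, a full sort for example, cannot be written as a chunk-at-a-time stage like this, which is where the tiling and decimation options come in.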
At the same time, PDAL's biggest weakness is acknowledged right at the beginning of its documentation:
PDAL doesn’t provide a friendly GUI interface, it expects that you have the confidence to dig into the options of Filters, Readers, and Writers
I think that statement undersells the weakness. Effectively users need to define a data flow graph by authoring JSON files. Even for expert users that's not ideal. Although I don't recall seeing it discussed on the mailing list, I'd be surprised if the core contributors haven't thought about what a GUI might look like. I would suggest checking in with the PDAL developers to see if they have any thoughts on how a GUI should be implemented.
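For anyone who hasn't seen one, this is roughly what authoring a pipeline by hand amounts to. The sketch below just builds the JSON with Python's standard library. The stage names (`readers.las`, `filters.range`, `writers.las`) are real PDAL stages, but the file names and the choice of filter are placeholders I picked for illustration.

```python
import json

# A three-stage linear pipeline: read a LAS file, keep only points
# classified as ground (class 2), and write the result back out.
# "input.las" and "ground.las" are placeholder file names.
pipeline = {
    "pipeline": [
        {"type": "readers.las", "filename": "input.las"},
        {"type": "filters.range", "limits": "Classification[2:2]"},
        {"type": "writers.las", "filename": "ground.las"},
    ]
}

pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```

Saved to a file, this should be runnable with something like `pdal pipeline ground.json`. Even in this trivial case the user has to know the stage names and the option syntax up front, which is exactly what a GUI could surface.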
smacl wrote (Tue Oct 23, 2018 8:24 am):
I think desktop is far more effective than server for most processing activity
I agree. There could certainly be value in preparing a long-running process on a local workstation and then sending it off to a local server farm to execute, but I think processes should be local first. The farther from my workstation execution occurs, the greater the costs in terms of security, transfer times, and storage. So far I haven't seen many compelling cloud processing services that overcome those increased costs, so I don't have much interest in remote applications running inside my web browser.
We use Qt for our internal GUI development too. I haven't used it so I don't know if it is the best option, but this is the Qt library I have starred for building node graphs:
https://github.com/paceholder/nodeeditor
Even though I listed a 3D viewport as part of a minimum viable product in the original post, I don't think that it is actually needed to build a useful GUI for PDAL. A GUI node editor for authoring PDAL pipeline files would be useful even if those pipelines needed to be executed and the results viewed externally. Nonetheless, there are a few options to consider for a 3D viewport.
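As a rough sketch of what the back end of such a node editor might do, the code below turns a small node graph into PDAL pipeline JSON. The `Node` class and `to_pipeline` function are hypothetical and not part of PDAL or nodeeditor. The only PDAL features assumed are the `tag` and `inputs` keys that pipeline stages accept for describing non-linear graphs, plus the `filters.merge` stage. File names are placeholders.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Node:
    """One box in the hypothetical editor: a PDAL stage plus its options."""
    tag: str
    stage_type: str
    options: Dict[str, object] = field(default_factory=dict)
    inputs: List[str] = field(default_factory=list)  # tags of upstream nodes

def to_pipeline(nodes: List[Node]) -> str:
    """Order nodes so upstream stages come first, then emit pipeline JSON."""
    by_tag = {n.tag: n for n in nodes}
    ordered: List[Node] = []
    state: Dict[str, str] = {}

    def visit(tag: str) -> None:
        if state.get(tag) == "done":
            return
        if state.get(tag) == "visiting":
            raise ValueError(f"cycle at node {tag!r}")
        state[tag] = "visiting"
        for upstream in by_tag[tag].inputs:
            visit(upstream)
        state[tag] = "done"
        ordered.append(by_tag[tag])

    for n in nodes:
        visit(n.tag)

    stages = []
    for n in ordered:
        stage = {"type": n.stage_type, "tag": n.tag, **n.options}
        if n.inputs:
            stage["inputs"] = list(n.inputs)
        stages.append(stage)
    return json.dumps({"pipeline": stages}, indent=2)

# Nodes in arbitrary order, as a user might have created them on a canvas.
graph = [
    Node("out", "writers.las", {"filename": "merged.las"}, inputs=["merged"]),
    Node("merged", "filters.merge", inputs=["a", "b"]),
    Node("a", "readers.las", {"filename": "a.las"}),
    Node("b", "readers.las", {"filename": "b.las"}),
]
graph_json = to_pipeline(graph)
print(graph_json)
```

The point is that the editor itself only needs to manage nodes, connections, and option values. Execution and viewing can stay with existing tools until a viewport is added.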
VTK seems like a reasonable choice, and I used it for some small point cloud stuff several years ago. ParaView is built on VTK and is designed for massive data sets, so it should be able to scale. Velodyne also has a basic application for viewing their lidar data that is built on top of ParaView.
CloudCompare is also an obvious place to look to see how point cloud rendering can be handled. I haven't dug into it, but I believe they are mostly using built-in Qt libraries with their own octree acceleration structure to store the point data.
Thinkbox used the Ogre3D game engine for the 3D viewport when they built Sequoia, so that could be another option to look at.
Regardless of the actual library used to do the rendering, I think the acceleration structure used underneath to feed the right points at the right time to the viewport is more important.
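To make that concrete, here is a minimal sketch of one common approach: an octree where every node keeps a small random sample of its points, and the viewport only descends into children for nodes that look large from the camera. This is a generic illustration under my own assumptions, not how CloudCompare, ParaView, or Sequoia actually do it. All names and the `detail` threshold are hypothetical.

```python
import math
import random
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]

@dataclass
class OctreeNode:
    center: Point
    half: float                                          # half the cube's edge length
    sample: List[Point] = field(default_factory=list)    # points drawn at this level
    children: List["OctreeNode"] = field(default_factory=list)

def build(points: List[Point], center: Point, half: float, per_node: int = 64,
          max_depth: int = 8, rng: Optional[random.Random] = None) -> OctreeNode:
    if rng is None:
        rng = random.Random(0)
    node = OctreeNode(center, half)
    if len(points) <= per_node or max_depth == 0:
        node.sample = list(points)
        return node
    # Keep a random subset here as the coarse level and push the rest down.
    shuffled = list(points)
    rng.shuffle(shuffled)
    node.sample, rest = shuffled[:per_node], shuffled[per_node:]
    buckets = {}
    for p in rest:
        key = tuple(p[i] >= center[i] for i in range(3))  # which octant
        buckets.setdefault(key, []).append(p)
    for key, pts in buckets.items():
        child_center = tuple(
            center[i] + (half / 2 if key[i] else -half / 2) for i in range(3)
        )
        node.children.append(
            build(pts, child_center, half / 2, per_node, max_depth - 1, rng)
        )
    return node

def visible_points(node: OctreeNode, camera: Point, detail: float = 1.0) -> List[Point]:
    """Collect samples, refining only where a node looks large from the camera."""
    out = list(node.sample)
    dist = max(math.dist(node.center, camera), 1e-9)
    if (2 * node.half) / dist > detail:  # apparent size exceeds the threshold
        for child in node.children:
            out.extend(visible_points(child, camera, detail))
    return out

rng = random.Random(42)
cloud = [(rng.random(), rng.random(), rng.random()) for _ in range(5000)]
root = build(cloud, (0.5, 0.5, 0.5), 0.5)
far = visible_points(root, (1000.0, 1000.0, 1000.0))   # coarse: root sample only
near = visible_points(root, (0.5, 0.5, 0.5))           # refined around the camera
print(len(far), len(near))
```

A real viewer would add frustum culling, a point budget, and out-of-core loading on top of this, but the same structure can feed whichever rendering library ends up being used.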