Threadripper

Discuss all Leica Cyclone, Cyclone REGISTER 360 & Cyclone FIELD 360 software here.
dhirota
V.I.P Member
Posts: 958
Joined: Sun Nov 01, 2009 11:18 pm
Full Name: Dennis Hirota
Company Details: Sam O Hirota Inc
Company Position Title: President
Country: USA
Linkedin Profile: Yes
Location: Hawaii, USA
Has thanked: 87 times
Been thanked: 379 times

Re: Threadripper

Post by dhirota »

smacl wrote: Mon Mar 25, 2019 6:08 pm Hi Dennis,

Great work, those speeds are impressive. It would be interesting taking the same data into a range of different software and seeing the comparative performance. It is also worth considering that some software, e.g. potree and anything else building out of core data structures, compromise initial import speed to improve performance scalability across huge data
....

What the above tells me is that point cloud I/O is considerably more CPU bound than disk bound when translating from a point cloud transfer format, whereas opening a program's own native format typically is not.
Your observation about native formats may not be true in every instance, but that is why I suggest that application software developers use the vendor SDK to improve I/O loading speeds as well as other features.
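For anyone who wants to sanity check this on their own hardware, here is a rough sketch (standard library only; the file path, block size, and the stand-in decode step are placeholders, not any vendor's importer):

Code: Select all

# Rough CPU-vs-disk timing sketch: compares a raw sequential read of a point
# cloud file against the same read plus a stand-in per-block decode step
# (unpacking int32 triples, roughly what an XYZ record decode costs).
# The second pass benefits from the OS file cache, so compare the general
# shape of the numbers, not the exact values.
import struct
import time

PATH = "scan.las"          # placeholder: any large point cloud file
BLOCK = 8 * 1024 * 1024    # 8 MB read blocks

def timed_read(decode=False):
    t0 = time.perf_counter()
    total = 0
    with open(PATH, "rb") as f:
        while True:
            buf = f.read(BLOCK)
            if not buf:
                break
            total += len(buf)
            if decode:
                usable = len(buf) - (len(buf) % 12)          # whole XYZ triples
                struct.unpack_from("<%di" % (usable // 4), buf, 0)
    return total, time.perf_counter() - t0

size, t_raw = timed_read(decode=False)
_, t_decode = timed_read(decode=True)
print("raw read   : %.0f MB/s" % (size / t_raw / 1e6))
print("read+decode: %.0f MB/s" % (size / t_decode / 1e6))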
Dennis Hirota, PhD, PE, LPLS
www.samhirota.com
[email protected]
dhirota
V.I.P Member
Posts: 958
Joined: Sun Nov 01, 2009 11:18 pm
Full Name: Dennis Hirota
Company Details: Sam O Hirota Inc
Company Position Title: President
Country: USA
Linkedin Profile: Yes
Location: Hawaii, USA
Has thanked: 87 times
Been thanked: 379 times

Re: Threadripper

Post by dhirota »

smacl wrote: Mon Mar 25, 2019 6:08 pm Same data on recap took 105 minutes, resulted in 19gb of storage on disk and opens again in about 20 seconds.
I imported the same file (516M pts, 17 GB LAS) into ReCap Pro to see how Autodesk processes it. It looked multi-threaded at times, but mostly appeared to be the usual single-threaded processing. The import, indexing, launch, and display (a few seconds) took 94 minutes without registration, since the scans were already registered using Riegl's GNSS RTK: a non-productive 90-minute wait compared to the Sequoia import and visualization on the same 4K display.

I spoke to the last remaining person I know at Autodesk to ask when we might see an improvement in I/O speed. The response was that Autodesk is moving to the cloud with their partner applications, so do not expect to see the kind of local processing we are discussing in this thread any time soon.

It looks like one may have to become a software developer to load and process information locally faster, or get Shane to do it for us.
Dennis Hirota, PhD, PE, LPLS
www.samhirota.com
[email protected]
jamesworrell
V.I.P Member
Posts: 537
Joined: Mon Jun 16, 2014 1:45 pm
Full Name: James Worrell
Company Details: Bennett and Francis
Company Position Title: Director
Country: Australia
Linkedin Profile: Yes
Location: Brisbane, Queensland, Australia
Has thanked: 14 times
Been thanked: 87 times

Re: Threadripper

Post by jamesworrell »

If I keep talking about it - it might become a reality ;-p

We need to move to multi-node processing. Things like tiling, meshing, machine learning/AI workloads, pre-processing, noise filtering, initial conversion/ingest, publishing conversion/compression, running constraints .. these are all workloads that would benefit from multi-node engines, each given discrete units of work.

ContextCapture Centre Edition, Sequoia - these systems are using queueing and engines running on multiple boxes. There is your inspiration.

RAID works - redundant array of INEXPENSIVE disks .. we really want the same for point cloud processing. A basic box with a reasonable GPU, some solid state drives, a decent whack of RAM (mostly for caching clouds) - and a smart queue manager - and the throughput from otherwise idle boxes could be enormous.

Your office of 20 workstations suddenly gets a whole lot more productive if working as a single engine. Naturally you would want priority queuing - so ingest might sit lower in the queue than, say, running constraints, with meshing higher up, etc.

As for the engine - Microsoft Project Orleans would be interesting. 10 Gbit/s networking would help - or at least dual links.
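
A toy version of that queue, just to make the shape concrete (threads stand in for separate boxes, and the job names and priorities are made up):

Code: Select all

# Toy priority work queue: worker "nodes" (threads here, separate machines in
# practice) pull discrete jobs, with constraints/meshing ranked above ingest.
import queue
import threading
import time

jobs = queue.PriorityQueue()

# (priority, job name) - lower number = higher priority
jobs.put((1, "run constraints: block A"))
jobs.put((1, "mesh tile 17"))
jobs.put((5, "ingest scan 042"))
jobs.put((5, "ingest scan 043"))

def worker(node_id):
    while True:
        try:
            priority, job = jobs.get_nowait()
        except queue.Empty:
            return
        print("node %d -> %s (priority %d)" % (node_id, job, priority))
        time.sleep(0.1)          # stand-in for the real processing
        jobs.task_done()

nodes = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for n in nodes:
    n.start()
for n in nodes:
    n.join()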
smacl
Global Moderator
Posts: 1409
Joined: Tue Jan 25, 2011 5:12 pm
Full Name: Shane MacLaughlin
Company Details: Atlas Computers Ltd
Company Position Title: Managing Director
Country: Ireland
Linkedin Profile: Yes
Location: Ireland
Has thanked: 627 times
Been thanked: 657 times

Re: Threadripper

Post by smacl »

jamesworrell wrote: Tue Mar 26, 2019 6:38 am If I keep talking about it - it might become a reality ;-p
It certainly makes a lot of sense and will no doubt happen, but for us poor code monkeys it means rewriting a lot of complex code in many cases. Some processes, such as photo stitching and rendering video, naturally break down into separate computational units as they work with locally independent data. Other processes, such as building spatial indices (e.g. TIN meshes and octrees), are more difficult to solve using a 'divide and conquer' mechanism, as the data is heavily interdependent and likely to get modified by multiple processes at the same time. In this scenario the synchronization and communication overhead can actually make the multi-threaded solution slower than the single-threaded one if not very carefully implemented and tested.

Local cloud computation clusters are definitely the near future, but we still have a bit to go yet. For most people at this point in time, I recommend multiple mid-range workstations rather than fewer very expensive ones, as processing time is only an issue if it is tying up the operator. If the operator can remain productive on another workstation, this isn't a problem. Personally, I find remote control over the LAN a great benefit here and am typically using three PCs at a time. I like what Sequoia are doing in terms of bundling a couple of compute licenses in with a main license and may nab this idea for SCC, as it doesn't seem entirely reasonable to lock up an expensive license for importing/exporting/publishing and similar grunt background work.
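
To make the difference concrete, a small sketch (nothing to do with any particular engine; the 'index' is just a dict keyed by coarse voxel, standing in for an octree or TIN). The per-chunk job scales with cores because the chunks are independent, while the shared index forces everything through a lock:

Code: Select all

# Independent chunks parallelise cleanly; a shared spatial index does not,
# because every insert has to synchronise on the same structure.
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

chunks = [[(i * 0.1, i * 0.2, i * 0.3) for i in range(10000)] for _ in range(8)]

def bbox(chunk):
    # easy case: each chunk reduces to its own result, no sharing
    xs, ys, zs = zip(*chunk)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))

shared_index = {}        # stand-in for an octree/TIN built by all workers
index_lock = Lock()

def insert(chunk):
    # hard case: all workers modify one structure, so a lock serialises them
    for x, y, z in chunk:
        key = (int(x), int(y), int(z))
        with index_lock:
            shared_index[key] = shared_index.get(key, 0) + 1

with ThreadPoolExecutor() as pool:
    boxes = list(pool.map(bbox, chunks))      # scales with cores
    list(pool.map(insert, chunks))            # mostly waits on the lock

print(len(boxes), "bounding boxes,", len(shared_index), "voxels indexed")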
Shane MacLaughlin
Atlas Computers Ltd
www.atlascomputers.ie

SCC Point Cloud module
jedfrechette
V.I.P Member
Posts: 1237
Joined: Mon Jan 04, 2010 7:51 pm
Full Name: Jed Frechette
Company Details: Lidar Guys
Company Position Title: CEO and Lidar Supervisor
Country: USA
Linkedin Profile: Yes
Location: Albuquerque, NM
Has thanked: 62 times
Been thanked: 220 times

Re: Threadripper

Post by jedfrechette »

jamesworrell wrote: Tue Mar 26, 2019 6:38 am If I keep talking about it - it might become a reality ;-p

We need to move to multi-node processing.
Stop talking and start doing. ;-) We're in the early stages of getting this set up internally. Everything you need to do it is available right now. All you need is a job queue manager and processing software that can do something useful in batch mode.

I suppose each application could develop its own job queue system, but that seems very inefficient compared to using a general-purpose job manager to orchestrate batch jobs. Job managers like Deadline and OpenCue already have all the tools needed to prioritize jobs based on certain criteria, make sure they go to compute nodes with appropriate hardware, retry them when they fail, and so on.

The other component is having processing software where you can say:

Code: Select all

program --do --something
from a command line and get a useful result. For us that is going to be the likes of PolyWorks, PDAL, CloudCompare, Agisoft, and Houdini.

Although I doubt they made the design decision with distributed processing in mind, one of the things I noticed while testing Scene 2019 is that the multithreaded preprocessing they've introduced seems to be implemented as a bunch of subprocess calls to individual worker programs. If they exposed a scripting interface that allowed the user to do the same thing, it would work really well with this type of distributed compute model.
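
As a rough illustration of how little glue that needs once the command line exists, here is a minimal batch runner (the tile names and single-retry policy are made up; pdal translate is just the example job, and a farm manager like Deadline or OpenCue would dispatch the same commands across nodes):

Code: Select all

# Minimal batch-runner sketch for CLI-driven tools: each tile becomes one
# "program --do --something" style job; a render-farm style manager could
# dispatch exactly the same commands to idle machines.
import subprocess
import sys

tiles = ["tile_001.las", "tile_002.las", "tile_003.las"]   # placeholder inputs

def run_job(cmd, retries=1):
    for attempt in range(retries + 1):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return True
        print("attempt %d failed: %s" % (attempt + 1, result.stderr.strip()),
              file=sys.stderr)
    return False

for tile in tiles:
    # PDAL infers LAZ compression from the .laz output extension
    cmd = ["pdal", "translate", tile, tile.replace(".las", ".laz")]
    print(("done: " if run_job(cmd) else "FAILED: ") + tile)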
Jed
smacl
Global Moderator
Posts: 1409
Joined: Tue Jan 25, 2011 5:12 pm
Full Name: Shane MacLaughlin
Company Details: Atlas Computers Ltd
Company Position Title: Managing Director
Country: Ireland
Linkedin Profile: Yes
Location: Ireland
Has thanked: 627 times
Been thanked: 657 times

Re: Threadripper

Post by smacl »

jedfrechette wrote: Tue Mar 26, 2019 2:44 pm The other component is having processing software where you can say:

Code: Select all

program --do --something
Additionally, you could check whether your program has an automation interface and drive it through your preferred scripting language, or simply use an automation tool such as AutoIt, see https://www.autoitscript.com/site/autoit/ The only problem with chaining programs together on the command line is that each one tends to start by reading your point cloud from disk and finish by writing it back out. In some cases this can be a very big overhead, in which case scripting makes more sense, as your point cloud stays in memory in its native format.
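
A quick sketch of that trade-off (numpy arrays standing in for a native in-memory cloud; the file names and filters are made up): the command-line style round-trips the cloud through disk at every step, while the scripted version loads it once:

Code: Select all

# Chaining separate programs vs. scripting one process: the first version
# writes and re-reads the cloud between steps, the second keeps it in memory.
import numpy as np

def load(path):                 # stand-in for a point cloud reader
    return np.load(path)

def save(path, pts):            # stand-in for a point cloud writer
    np.save(path, pts)

def thin(pts):                  # keep every 2nd point
    return pts[::2]

def clip_z(pts, zmin):          # drop points below zmin (Nx3 array assumed)
    return pts[pts[:, 2] >= zmin]

# Command-line style chaining: each step is effectively its own program run.
save("step1.npy", thin(load("cloud.npy")))
save("step2.npy", clip_z(load("step1.npy"), zmin=0.0))

# Scripted equivalent: one load, one save, no intermediate files.
pts = load("cloud.npy")
save("result.npy", clip_z(thin(pts), zmin=0.0))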
Shane MacLaughlin
Atlas Computers Ltd
www.atlascomputers.ie

SCC Point Cloud module
Jason Warren
Administrator
Posts: 4224
Joined: Thu Aug 16, 2007 9:21 am
Full Name: Jason Warren
Company Details: Laser Scanning Forum Ltd
Company Position Title: Co-Founder
Country: UK
Skype Name: jason_warren
Linkedin Profile: No
Location: Retford, UK
Has thanked: 443 times
Been thanked: 246 times

Re: Threadripper

Post by Jason Warren »

jedfrechette wrote: Tue Mar 26, 2019 2:44 pm Although I doubt they made the design decision with distributed processing in mind, one of the things I noticed while testing Scene 2019 is that that the multithreaded preprocessing they've introduced seems to be implemented as a bunch of subprocess calls to individual worker programs. If they exposed a scripting interface that allowed the user to do the same thing it would work really well with this type of distributed compute model.

A `Backburner` for Scene 2019 would be cool... :like:
Jason Warren
Co-Founder

Dedicated to 3D Laser Scanning
LaserScanningForum
Scott
V.I.P Member
Posts: 1037
Joined: Tue Mar 29, 2011 7:39 pm
Full Name: Scott Page
Company Details: Scott Page Design- Architectural service
Company Position Title: Owner
Country: USA
Linkedin Profile: No
Location: Berkeley, CA USA
Has thanked: 206 times
Been thanked: 78 times

Re: Threadripper

Post by Scott »

Jason Warren wrote: Tue Mar 26, 2019 5:49 pm
jedfrechette wrote: Tue Mar 26, 2019 2:44 pm Although I doubt they made the design decision with distributed processing in mind, one of the things I noticed while testing Scene 2019 is that that the multithreaded preprocessing they've introduced seems to be implemented as a bunch of subprocess calls to individual worker programs. If they exposed a scripting interface that allowed the user to do the same thing it would work really well with this type of distributed compute model.
A `Backburner` for Scene 2019 would be cool... :like:
I just watched FARO's recent webinar, which I'd missed. Part of it covered core counts and speed. I did learn some new things from the 60-minute webinar, so I recommend it, technical glitches and all. ~Scott

Leveraging the SCENE evolution (webinar) - sign-in required
https://insights.faro.com/construction- ... gnId=3109
jamesworrell
V.I.P Member
Posts: 537
Joined: Mon Jun 16, 2014 1:45 pm
Full Name: James Worrell
Company Details: Bennett and Francis
Company Position Title: Director
Country: Australia
Linkedin Profile: Yes
Location: Brisbane, Queensland, Australia
Has thanked: 14 times
Been thanked: 87 times

Re: Threadripper

Post by jamesworrell »

jedfrechette wrote: Tue Mar 26, 2019 2:44 pm Stop talking and start doing. ;-)
My quals are actually in information technology although I work for a surveying firm - go figure .. I have coded for 30+ years. Meh, whilst it would be really interesting, I'm not in the startup game! It's certainly a fun time to work on the code side of point clouds - meshes, AI, cloud compute, open graphics (e.g. 3D Tiles), decent wide-area networking (5G), photogrammetry, big data (e.g. consider the full stream of data from Toyota or other laser-equipped cars).

https://www.spatialsource.com.au/latest ... -bandwagon

Vendors should take a leaf out of Microsoft's book and make everything API-first. Windows Server is a great example, with basically everything surfaced through PowerShell.
lastools
V.I.P Member
Posts: 144
Joined: Tue Mar 16, 2010 3:06 am
Full Name: Martin Isenburg
Company Details: rapidlasso - fast tools to catch reality
Company Position Title: creators of LAStools and LASzip
Country: Germany
Skype Name: isenburg
Linkedin Profile: Yes
Been thanked: 1 time

Re: Threadripper

Post by lastools »

Hello from @LAStools,

Yep. LASzip decompression on a single core is typically CPU bound (unless the data comes straight across the Internet). But LASzip was designed with parallel decompression in mind, so using multiple cores to decompress a single LAZ file is theoretically possible. It's just a matter of writing (and then maintaining) the code. Each "chunk" of 50,000 points is compressed independently of all other chunks, so 4, 8, 16, 48, or 128 chunks could be decompressed simultaneously if you have fast file read access and many cores. However, someone (you?) would need to implement a multi-threaded decompressor. All the code is open source, so go right ahead ... (-:
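
In code terms, the chunked layout means a parallel reader boils down to something like this (zlib is only a stand-in for the LASzip codec here; the real work is wiring these two functions to the open-source LASzip chunk table and decoder):

Code: Select all

# Stand-in demo of chunk-parallel decompression: each 50,000-point chunk is
# independent, so one task per chunk can be handed to a process pool.
# zlib fakes the codec; real code would read the LAZ chunk table and call
# the LASzip decoder per chunk instead.
import zlib
from concurrent.futures import ProcessPoolExecutor

POINTS_PER_CHUNK = 50000
POINT_SIZE = 34                       # bytes per LAS point record (format 3)

def make_chunk(i):
    # fake one compressed chunk of point records
    raw = bytes([i % 256]) * (POINTS_PER_CHUNK * POINT_SIZE)
    return zlib.compress(raw)

def decode_chunk(blob):
    # stand-in for decoding one independent chunk
    return len(zlib.decompress(blob))

if __name__ == "__main__":
    chunks = [make_chunk(i) for i in range(16)]
    with ProcessPoolExecutor() as pool:            # one chunk per worker
        sizes = list(pool.map(decode_chunk, chunks))
    print("decoded %d chunks, %.0f MB of point records" % (len(sizes), sum(sizes) / 1e6))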

Regards,

Martin @rapidlasso