
Is 2.5 now CPU multi-core?


Mr_sukebe


zS5p5HO.png

 

(Joke aside, this thread bouncing thing is most probably not limited to DCS, but maybe other programs use tricks to make threads stick to a certain core most of the time)

 

 

Without Lasso I would get a micro-stutter when shooting down a Ka-50 with the Harrier A/A.

With Lasso I don't, so surely that is a major positive for 4-core users?

 

Can you please confirm whether or not you are using power saving features in BIOS.

 

If you feel up to it, disable all of them, and either disable Process Lasso or set it to the default affinity (ALL). Then try again while your CPU is running at max clocks without any downclocking.

 

If it does work fine, then it's not about the bouncing of threads, but the fact that the cores have to spool up just as ED community manager mentioned here: https://forums.eagle.ru/showthread.php?t=118535&highlight=CPU+AFFINITY

 

Nope, it's not multi-core.

2-4 max

 

Very funny.:doh:

 

DCS has been multi-threaded and multi-core for a long time.

 

As I have ranted many times in other venues, the industry sometimes has horrible terminology. It uses the term "multi-threading" for what is just a standard day at the office: running multiple threads on a single core. For that many threads to work "at once", each one gets to run for a short slice of time, and because the switching is so fast, it creates an illusion of separate things executing "simultaneously". There is nothing simultaneous or parallel about it. It is a trick, and the terminology fails to relay that to the end user.

 

The program just uses threads the same way today as it did before multi-core CPUs. For increased performance, developers need to parallelize the parallelizable workload. That needs special programming, but it also involves multiple threads, because that is how the OS kernel scheduler can put the work on separate cores.

 

Why has DCS been multi-core? Because we're using the English language and anything over 1 is "multi". English still half-recognizes the idea of duality, though, so more specifically DCS would have "dual-core support" or be "optimized for dual-core", if you want to put it that way. But these are all horrible marketing-type terms that aren't supposed to be taken literally. If you have a single-core CPU, those two threads would of course run on a single core, which means it would work as if it were one thread, because one core can only run one thread at a time. (I'm ignoring HT for now; it's not equivalent to another physical core anyway, more like a 20-30% performance increase in practice.)

 

The industry also uses wrong terms, or wrong context when communicating with the customers, probably on purpose to stay on the same page as the customers, when they say "multi-threading support", they really mean "we'll offload some work that's on this thread, which is really busy and taking a lot of the resources, onto another thread so the OS Kernel scheduler can put it to another core and run it simultaneously".

 

This is the right way to understand multithreading. However, using separate threads for some tasks can still improve overall performance. For example, running sound processing in a dedicated thread can help, because sound is a matter of streaming buffers on a strict schedule, and it is better to let that run on its own, separate from the main application loop...

 

 

 

I doubt the audio is on a "different core". It is highly probable that audio processing runs in a separate thread, but not on a dedicated core. The simple reason: I don't know of any way to dedicate a particular process or task to a particular core... as far as I know, nothing in the Windows API allows that. If it is possible, I'd be curious to know how it is done.
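
On the question of whether a program can ask for particular cores at all: operating systems do expose this as "affinity". On Windows the Win32 calls are `SetProcessAffinityMask` and `SetThreadAffinityMask`. The sketch below uses the Linux-only `os.sched_setaffinity` from Python's standard library purely as an illustration of the idea; it says nothing about what DCS itself does, and the helper name `pin_current_process` is made up for this example.

```python
import os

def pin_current_process(cpus):
    """Restrict the calling process to the given logical CPUs, where supported.

    Returns the affinity set in effect afterwards, or None when the platform
    has no os.sched_setaffinity (for example Windows or macOS).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = os.sched_getaffinity(0)   # CPUs we may run on right now
    wanted = set(cpus) & allowed        # never ask for a CPU we don't have
    if not wanted:
        return allowed                  # nothing valid requested: leave as-is
    os.sched_setaffinity(0, wanted)     # pid 0 means "the calling process"
    return os.sched_getaffinity(0)

if __name__ == "__main__":
    print(pin_current_process([0]))
```

Affinity only limits where the scheduler may place the threads; inside the allowed set the OS still decides which core runs what at any moment.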

 

I haven't included the clues about audio in the first picture above. Briefly: I've seen mentions of audio on the third thread, which is the first one below 1% usage. This is all speculation though, so I probably won't put any theories up just yet. There were obviously no sounds being played in this test, so it would make sense in one way. DCS also seems to tax the CPU/GPU similarly even when it's paused. I find that hard to believe even though it looks like it, and it would need a proper benchmark that runs for a bit longer.

 

Well, I know of at least one application that won't utilize virtual cores (HT) unless specifically told to do so, so there must be at least some control at the application level...

 

I also remember that when HT multi-core CPUs became a thing in mainstream gaming, some games would struggle because they would, for example, use one physical core and its virtual twin for parallelization (hope that's the correct term) instead of using two "full" cores.

This problem got solved at some point, but I am not sure whether it was solved in the OS or per application.

(You sometimes still have people advising to deactivate HT for gaming. I think this dates back to that original struggle. In modern games I have never observed any benefit from deactivating HT...)

 

The OS handles this. The programmer decides how many threads to spawn and what to do in them, the OS determines where to run them, whether on separate cores or not AND which cores at any given time. There's a scheduler in the OS that allows threads to run and it can interrupt them when it needs those cores to do other things (update the clock, redraw the screen, play an email alert etc). You can set an app's priority (task manager, process lasso etc), i.e. higher priority -> interrupt other stuff before this one or CPU affinity -> try your best to only use THESE CPUs (cores in our case but the OS see them as CPUs nonetheless)

 

Yes, the OS Kernel Scheduler also needs to put the biggest threads on a SEPARATE PHYSICAL core, not just a separate HW Thread that gets reported in Task Manager as if it was a real physical core. I heard the max benefit of HT is like 50%, but more like 20-30% in practice by other people who corrected me, so it's really a HUGE gimmick that's being reported to the OS transparently as just another hardware core, if that's the case, that's so wrong.

 

DCS runs on 2 threads. Multithreaded optimizations have not been implemented, from what I can tell. Windows will try to optimize any application running on a single thread across multiple cores; it will attempt to "distribute the load" as it sees fit. This is what's happening here.

 

My understanding of DCS is that it runs everything on 1 thread, and sound on another thread.

 

As an example I wrote a simple app that creates a thread and quite literally infinite loops to saturate the processor, this single thread can be seen running on all of the cores on my machine.

 

cpu_example.gif

 

As you can see from the image, all 4 cores are working, but not to their full extent; this is typical. It basically tops out at around 25% with 4 cores... so quick math: 1 / 4 = 0.25. Yup, they are distributing the load. Now, assuming I added a 2nd thread, we would probably see 50% utilization. Hope this helps.

 

Code for reference (if you are into that kind of thing)

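using System;
using System.Threading;

class Program
{
    static void Main(string[] args)
    {
        // Start one worker thread that spins forever.
        Thread t = new Thread(DoWork);
        t.IsBackground = true; // let the process exit once a key is pressed
        t.Start();
        Console.Read();
    }

    private static void DoWork()
    {
        while (true)
        {
            // Just work hard: a busy loop that keeps one core's worth of CPU occupied.
        }
    }
}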
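
The 25% and 50% figures quoted with this test follow from simple arithmetic: a fully busy thread can use at most one core's worth of time, so the total shown by a tool like Task Manager is bounded by busy threads divided by cores. A small sketch of that calculation (the function name is invented for illustration):

```python
def expected_total_utilization(busy_threads, cores):
    """Fraction of total CPU capacity used by fully busy threads."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    # A thread can never use more than one core at a time.
    return min(busy_threads, cores) / cores

print(expected_total_utilization(1, 4))  # 0.25
print(expected_total_utilization(2, 4))  # 0.5
```

Note that this number says nothing about which core did the work at a given instant, only how much work was done in total.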

 

Well you're right about the 2 threads, if we ignore the rest which are below 1% for all practical uses.

 

Secondly, that term is what gets people confused. I would avoid using it. We should be using something like "splitting the parallelizable workload into multiple threads which can then be put on multiple cores". The shorter you make it, the more context and description it loses, and marketing folks love short, confusing terms.

 

Thirdly, I think what you're saying about "spreading the workload" is exactly what my image here is revealing: it's all fake. It's thread bouncing, which creates an illusion, just like a series of still pictures creates the illusion of movement (not to be confused with context switching). The OS kernel scheduler can't magically split the work inside one thread to run on multiple cores at once. The thread itself jumps from one core to another.
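
To make the bouncing argument concrete, here is a toy model, not a model of the real Windows scheduler: one always-busy thread is placed on a randomly chosen core for each time slice. Random placement, the slice count, and the seed are all assumptions chosen for illustration.

```python
import random

def simulate_bouncing(cores=4, time_slices=10_000, seed=1):
    """Model one always-busy thread that the scheduler moves between cores.

    Returns per-core utilization averaged over the whole run.
    """
    rng = random.Random(seed)
    busy_slices = [0] * cores
    for _ in range(time_slices):
        core = rng.randrange(cores)  # the thread occupies exactly one core per slice
        busy_slices[core] += 1
    return [count / time_slices for count in busy_slices]

per_core = simulate_bouncing()
print([round(u, 2) for u in per_core])  # each value lands near 1 / cores
print(round(sum(per_core), 2))          # the total is still one core's worth of work
```

Averaged over time, every core looks partly loaded, even though at any single instant only one of them is running the thread.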

 

It's been a while since one of the developers made this statement. So there should be even more progress meanwhile.

 

So the old "lore" that DCS uses only two threads does not strictly hold true any more.

 

Yes and no. Yes, because they didn't bother to mention the rest of the practically irrelevant threads. And no, because it looks like only 2 of the many threads take most of the CPU usage.

 

If you wanted to say it in marketing terms: DCS is optimized for dual-core. How much it is optimized is another question; from what's been in the air, audio and I/O are separate.

 

Now, simulation/physics is said to be unparallelizable*, and because DCS is a simulator, no matter how much of the rest is parallelized and put on more threads, DCS will still have one thread doing the most work, and it will still be a bottleneck, just a lesser one in the future. The Vulkan API should ease the load by default as well, because its draw calls cost much less. No CPU optimizations are needed for this; the Vulkan API gives it for FREE.
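
The "one thread stays the bottleneck" point is what Amdahl's law describes: if only a fraction of the work can run in parallel, the serial remainder caps the overall speedup. The sketch below applies that standard formula. The 30% parallel fraction is a made-up number for illustration, not a measurement of DCS.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Overall speedup when only part of the work can use extra cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

# Hypothetical workload: 30% parallelizable, 70% stuck on one thread.
for cores in (2, 4, 8):
    print(cores, round(amdahl_speedup(0.3, cores), 2))
```

With these assumed numbers, going from 2 to 8 cores barely moves the result, because the serial 70% dominates.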



Modules: A-10C I/II, F/A-18C, Mig-21Bis, M-2000C, AJS-37, Spitfire LF Mk. IX, P-47, FC3, SC, CA, WW2AP, CE2. Terrains: NTTR, Normandy, Persian Gulf, Syria

 


It's been a while since one of the developers made this statement. So there should be even more progress meanwhile.

 

So the old "lore" that DCS uses only two threads does not strictly hold true any more.

 

 

Absence of an update in info would tend to imply nothing changed. They also haven't commented in the last few days if we were getting a modern Russian fighter, but the answer for years is no so it's safe to assume it's still no.

 

Last time anybody commented on it, I think it was Matt less than a year ago and he said no with no plans to do it . This topic comes up every couple weeks, AT LEAST. I'm sure they just get tired of repeating the same answers 47 times, each time like it's never been brought up before =)

Where there are enemies, Cossacks will be found to defeat them.

5800x3d * 3090 * 64gb * Reverb G2


[...] Now, simulation/physics is said to be unparallelizable* [...]

Excuse me? The jobs of thousands of theoretical physicists, chemists, meteorologists, geologists, material scientists, etc. doing massively parallel simulations on thousands of cores are at stake if you spread that news.


A warrior's mission is to foster the success of others.

i9-12900K | MSI RTX 3080Ti Suprim X | 128 GB Ram 3200 MHz DDR-4 | MSI MPG Edge Z690 | Samung EVO 980 Pro SSD | Virpil Stick, Throttle and Collective | MFG Crosswind | HP Reverb G2

RAT - On the Range - Rescue Helo - Recovery Tanker - Warehouse - Airboss


Excuse me? The jobs of thousands of theoretical physicists, chemists, meteorologists, geologists, material scientists, etc. doing massively parallel simulations on thousands of cores are at stake if you spread that news.

 

Well said. :)


Control is an illusion which usually shatters at the least expected moment.

Gazelle Mini-gun version is endorphins with rotors. See above.

 

Currently rolling with a Asus Z390 Prime, 9600K, 32GB RAM, SSD, 2080Ti and Windows 10Pro, Rift CV1. bu0836x and Scratch Built Pedals, Collective and Cyclic.


Excuse me? The jobs of thousands of theoretical physicists, chemists, meteorologists, geologists, material scientists, etc. doing massively parallel simulations on thousands of cores are at stake if you spread that news.

 

Most (if not all) of these simulations are not done in real time and are not interactive. The software design and purpose are not the same, and neither is the kind of simulation: they are highly specialized for a particular task and are not (or are less) limited by memory and real-time management.

 

In a game physics engine, you constantly optimize by selecting what must be computed and what need not be, and you can have a lot of exceptions. For example, you don't compute the collision test of a bullet against a body the same way as the collision of one body with another or with the ground... Also, parallelizing things like collision tests is not straightforward. With parallelization you, for example, send two buffers of 1000 raw values as input, plus one algorithm that computes something 1000 times over those pairs of values, and you get 1000 resulting values as output... This is fine for updating the positions of a bunch of particles, but it is ill-suited to testing a collision, because the data you need is conditional and "an exception" by definition:

Is there a collision (true or false)? If yes, where is it (what vertex, at what velocity, at what position)? Parallelization is not designed to output the result of conditional branching, where the output data does not have the same format and nature as the input. It is designed to compute the same thing X times and output X results of the equation.

 

I don't mean this is impossible, but when you are about to implement such parallelization, you ask yourself: is there a real benefit in parallelizing thousands of useless computations so they run faster than the regular way, instead of optimizing the regular way to skip the useless computations through conditional tests and get directly what I need to continue my process?
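
The contrast drawn above between uniform and conditional work can be sketched in a few lines. Both functions below are invented toy examples (plain lists, a flat ground plane), not how any real physics engine is written.

```python
def update_heights(heights, velocities, dt):
    """Uniform work: one output per input, the same formula everywhere."""
    return [h + v * dt for h, v in zip(heights, velocities)]

def first_ground_hit(heights, ground=0.0):
    """Conditional work: the answer is either 'no hit' or one specific body."""
    for index, h in enumerate(heights):
        if h <= ground:
            return index, h   # which body hit, and where it ended up
    return None               # no collision this step

heights = update_heights([10.0, 0.5, 3.0], [-1.0, -1.0, -1.0], dt=1.0)
print(heights)                    # [9.0, -0.5, 2.0]
print(first_ground_hit(heights))  # (1, -0.5)
```

The first function maps cleanly onto "same operation, N times"; the second returns a result whose shape depends on the data, which is the part that is awkward to express as a bulk parallel operation.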



Forgot to note in picture: There's other threads that are still in red color which I didn't put a specific color on, I just picked the biggest one and put it to green.

 

HNjKUXq.png

 

I figured I could joke about it while I'm at it, but don't take it as written in stone. There could be some reason for thread bouncing that I haven't figured out yet and that is generally less understood out there. I'll see whether this thread bouncing is also present with HT enabled. My CPU is Sandy Bridge-E, which has been around long enough for Win10 to support the processor's topology, so why would it thread-bounce so much?

 

After I made that post, and while not many PC gurus or programmers I asked knew about it, I found that PCPER and WCCFTECH had talked about it in depth. They also mentioned the thing I was discussing elsewhere in the middle of all this: the idea that the OS transparently sees HT cores as if they were just more physical cores. That would be so wrong, considering an HT core is nowhere near equivalent to a physical one, which is how it appears to an average person looking at Task Manager. It could be considered downright misleading! These stupid simplifications do nothing to help consumers. They only confuse them more and, of course, give a false impression of Windows putting your CPU to good use.

 

So, I'm not even that knowledgeable about the details, but logically I was suspecting: what if the OS kernel scheduler is putting those 2 DCS threads onto the same physical hyperthreaded core? That would be a pretty bad idea. And Task Manager doesn't tell you which pairs belong to which CPU core; it's up to guessing from the numbers. That's where CPU affinity comes in, so you can force the use of different physical cores.

 

The below links talk about the AMD's CPU, I'm not comparing this with that, it's about the explanation of the thread-bouncing thing and other behavior.

 

https://www.pcper.com/reviews/Processors/AMD-Ryzen-and-Windows-10-Scheduler-No-Silver-Bullet

 

 

https://wccftech.com/amd-ryzen-performance-negatively-affected-windows-10-scheduler-bug/

 

 

https://superuser.com/questions/26240/why-is-a-single-thread-spread-across-cpus

 

 

So I think I could write these 3 *conclusions* from all of this:

  • You don't want 2 big threads on a single physical core (applies only to HT/SMT).

  • You don't want big threads bouncing at fast rates between physical cores or, even worse, between CCXs.
  • Thread bouncing is NOT proper load balancing. It is an artificial illusion, like the illusion of constant light from the 50/60 Hz on-off cycle of AC home lighting. Proper load balancing is the OS kernel separating hungry threads onto multiple physical CPU cores and only populating the HT pairs later, once all physical ones are occupied. It is also not to be mistaken for context switching, which happens at a very deep level and is a normal part of how computers work.

 

Excuse me? The jobs of thousands of theoretical physicists, chemists, meteorologists, geologists, material scientists, etc. doing massively parallel simulations on thousands of cores are at stake if you spread that news.

 

That's why the statement was marked with a *, and I forgot to apply the italic font.

 

Some types of work are hard to parallelize and take way more programming effort, and linking the work across cores also comes at a cost, which can diminish the return from running it in parallel.

 

Absence of an update in info would tend to imply nothing changed. They also haven't commented in the last few days if we were getting a modern Russian fighter, but the answer for years is no so it's safe to assume it's still no.

 

Last time anybody commented on it, I think it was Matt less than a year ago and he said no with no plans to do it . This topic comes up every couple weeks, AT LEAST. I'm sure they just get tired of repeating the same answers 47 times, each time like it's never been brought up before =)

 

Ladies and Gentlemen we have a winneeeeeeeer!

 

But there are still optimizations that aren't immediately visible inside those two* threads; the Caucasus map being a lot beefier would hide those. My average FPS is about 20 FPS lower, but that's to be expected. The Vulkan API might correct all of it and then some!

The whole multi-core topic is just an attractive chocolate dessert that nothing can stop, but let's remind ourselves that the 2.5 release is really good. I watched something like 40 first-impression videos, way more than I expected.

I haven't done all the combinations in testing yet, so I need to be careful not to set anything in stone right now. In the first test it's 2 threads, but that was one very specific test only. I need to check whether playing audio changes anything.



Wader8, thanks for spending the time with Windows Performance Analyzer to explain how DCS runs on our systems. Very good info there!

Steve (Slick)

 

ThrustMaster T.Flight Hotas X | TrackIR5 Pro | EVGA GTX 1070 | Win10 64-bit Professional | Dell Precision 7920 Workstation | 1 TB SSD | 128 GB Memory | Dual Intel Xeon Platinum 2.0 GHz 16 Core Processors (64 Total w/HT ON) | 24" Dell Monitor


I haven't included the clues about audio into the first picture above, but briefly, I've seen mentions of audio on the third thread which is the first one below 1% usage, this is all speculation tho so I probably won't try to put any theories up just yet, there was obviously no sounds being played in this test, so it would make sense in one way, but DCS seems to be taxing CPU/GPU similarly even if it's paused but I find that hard to believe even tho it looks like it but would need a proper benchmark which would run for a bit longer.

 

I don't think you will observe a huge difference with or without sound. I don't know how the DCS sound engine is programmed, but speaking from my own experience, the need to put sound processing in its own thread is not about heavy CPU usage. It is about timing and buffer switching with a varying FPS... Sound processing needs buffers to be switched at regular intervals (streaming). If you include this in your main loop (where FPS can be high or low depending on the situation, the system, etc.), it's like trying to sync two different worlds (one can delay the other)... Putting sound in a separate thread with its own "while(true)" clockwork solves the problem, even if it's not "multi-core".
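
A minimal sketch of that "own clockwork" idea, under stated assumptions: the `AudioFeeder` class, the 10 ms period, and the frame times are all invented for illustration, and incrementing a counter stands in for queuing a real sound buffer.

```python
import threading
import time

class AudioFeeder:
    """Refills a (pretend) sound buffer on its own fixed schedule."""

    def __init__(self, period_s=0.01):
        self.period_s = period_s
        self.refills = 0
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait doubles as the sleep and as the stop signal.
        while not self._stop.wait(self.period_s):
            self.refills += 1  # stand-in for "queue the next sound buffer"

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

def run_demo(frame_times=(0.005, 0.08, 0.01, 0.12)):
    feeder = AudioFeeder()
    feeder.start()
    for frame_time in frame_times:  # main loop with uneven frame times
        time.sleep(frame_time)      # stand-in for simulating and drawing a frame
    feeder.stop()
    return feeder.refills

if __name__ == "__main__":
    print(run_demo())
```

The feeder keeps ticking during the slow 80 ms and 120 ms "frames". Nothing here requires a second core; the benefit is that the audio schedule no longer depends on the frame rate.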



If the end result of the DCS game process were already known, it would be easy to split it across multiple threads.

For example, a video that is going to be rendered is super easy to split across multiple threads. You can do it in multiple ways, such as giving every core/thread its own frame to render; each then jumps to the next unprocessed frame, and they keep going until all the frames in the video sequence are processed.

With images it is super easy too: you already have the resolution and you have the effect, so you can split the image into areas, rows, etc. and process them all separately.
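
Splitting an image by rows, as described above, looks roughly like this. The "brighten" effect and the tiny image are placeholders made up for the example; the point is only that each row can be handed to a worker independently.

```python
from concurrent.futures import ThreadPoolExecutor

def brighten_row(row, amount=10):
    """Apply the effect to one row; no other row is needed to do this."""
    return [min(255, pixel + amount) for pixel in row]

def brighten_image(image, workers=4):
    """Hand each row to a worker pool and reassemble the rows in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(brighten_row, image))  # map preserves row order

image = [[0, 100, 250], [5, 5, 5]]
print(brighten_image(image))  # [[10, 110, 255], [15, 15, 15]]
```

In CPython, threads will not actually speed up CPU-bound work like this because of the global interpreter lock. The sketch shows how the work divides, and a process pool would be the variant to reach for if real parallel speedup were the goal.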

 

In 3D rendering it is a little more difficult, because the process becomes order-sensitive: you can't draw shadows until the light-ray modeling is done, and you can't do light-ray modeling until each object's surface features are calculated. If a single light ray doesn't bounce (light hits an object, which gets its color and shadow), it is super easy. If the light ray bounces once (light hits object A and is reflected onto object B), it becomes more difficult. What happens if the ray tracing has 7-14 bounces? Still easy if it is just a single light ray, but what if there are 10,000 light rays? You need a lot of time to calculate that.

The problem with DCS is that it is a real-time game. Something happens, it affects everything else, and it needs to be calculated quickly within a specific time period. You can't give it too much time or it takes too long. So you typically just start something, and if it doesn't finish in time, it is dropped and something is estimated instead.

That is basically what Oculus ATW (Asynchronous Timewarp) does: it takes frames that arrive with a long delay, estimates what should have been in between, and fills the gap with made-up information! Now you don't have a slideshow but smooth rendering, though with annoying visual warping if the estimate was wrong. So no headache, but it looks bad.

In DCS the developers are right that they would not benefit that much from multi-threaded programming. As long as the main cause of delay and calculation is something that can't be split into parts with a known final outcome, it can't be distributed.

 

A > B > C > D > E > F

1 > 2 > 3 > 5 > 8 > 22 > 75

 

If you do not know how 3 leads to 5, and how that leads to 8, 22 and 75, you can't tell core 1 "you render 8, 22 and 75, I'll render 1, 2 and 3", because 1 needs to come before 2, that before 3, and that before everything else.
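
The dependency chain above can be written as a loop in which every step needs the result of the previous one. The update rule below is arbitrary and made up; any rule where the next state depends on the current state behaves the same way.

```python
def run_chain(start, steps, step_fn):
    """Each state is computed from the previous one, so the steps cannot be dealt out."""
    states = [start]
    for _ in range(steps):
        states.append(step_fn(states[-1]))  # needs the state just produced
    return states

def next_state(x):
    return (x * 3 + 1) % 17  # arbitrary stand-in for "simulate one tick"

print(run_chain(1, 5, next_state))  # [1, 4, 13, 6, 2, 7]
```

There is no way to hand the last three steps to another core up front, because their inputs do not exist until the first steps have run.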

 

So to put it in real world context.

 

DCS as a game looks like a final product. But the computer is running the production line in real time, from the specs the developers have coded, modified by how the player is flying and what he is doing.

You can't assemble the final product on the assembly line if half of the parts are missing because the production line for those parts has too much to do.

So everyone is waiting for the slowest link. No matter how fast everyone else is, if that one can't go any faster, then everyone else waits.

 

So why spend time programming a multi-threaded process when the outcome is unknown and every other thread would be waiting on that time-critical thread anyway?

What if you could get a 1-5% or even 10% performance benefit from it, but with nasty side effects? Is it worth it?

Think about it... If you now get 44 frames per second, a 10% performance increase would be just 4 frames! So now you get 48 frames in the best possible scenario!

44 frames vs 60 frames, that would be something, but that is already a 36% increase!
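
The frame-rate arithmetic above is easy to check. The helper names are invented for this example; the numbers are the ones from the post.

```python
def fps_after_gain(fps, gain):
    """Frame rate after a relative performance gain (0.10 means 10%)."""
    return fps * (1.0 + gain)

def gain_needed(current_fps, target_fps):
    """Relative gain required to get from one frame rate to another."""
    return target_fps / current_fps - 1.0

def frame_time_ms(fps):
    """Time budget per frame in milliseconds."""
    return 1000.0 / fps

print(round(fps_after_gain(44, 0.10), 1))                        # 48.4
print(round(gain_needed(44, 60) * 100))                          # 36
print(round(frame_time_ms(44), 1), round(frame_time_ms(60), 1))  # 22.7 16.7
```

Seen as frame time, going from 44 to 60 FPS means cutting roughly 6 ms out of every frame.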

 

So think about it like this in DCS.

We fly and we drop a bomb on target. We see how the bomb flies and hits.

 

What do we expect? We expect to see an explosion.

How many have forgotten to choose a nose/tail fuze before dropping a Mk-82, waited for a big explosion, and instead seen a small "puff" of sand as the few hundred kilograms went a couple of meters into the ground? Is it a nice effect?

So how would you feel if you dropped a bomb with the fuze set correctly, saw it hit the ground, and then nothing? And only a few seconds later you see the explosion!

What would you think of that? That is what you would get from a multi-threaded system that allows delay in order to avoid a heavy CPU load: the explosion calculation would take its own time and only be shown when it is done.

The target has moved away from the area, or the target just suddenly explodes, etc.

It would be an unbelievably nasty and laggy experience.

So instead we take a CPU hit right when the bomb explodes, with everything calculated and rendered in real time, even though frames drop. Even if that were multi-threaded, all the other threads would be waiting for the single core responsible for those calculations to finish as quickly as possible.

So why make the effort when everything would still be waiting for it?

 

But SOME things could be improved. For example, make the ground AI separate, so that when its core can't keep up with pathfinding, LOS, etc., the ground units all stop moving, but your aircraft still flies straight and smooth.

But that again adds the same problems you see in an online FPS shooter when a player A with 500 ms lag plays with others who have 10 ms lag.

Player A jumps around the corner in front of someone else: either he sees that someone 500 ms earlier and can shoot, or he sees the other player 500 ms later and is dead.

There is no win-win situation when one side arrives with lag, with old information.

 

 

And all this is similar to the challenge that missile and radar engineers need to solve. Lots of delays, lots of echoes, lots of timing-critical calculations that a high-speed object needs to perform against an unknown moving target.

That is why missiles don't have a high Pk, and why you don't shoot in specific modes against a specific kind of target from a specific range in specific weather and attitude, etc., as you would just be lowering the Pk. You don't know exactly how the filtering and math in the algorithms work, but such complex things are happening that it really is amazing we have radar-guided missiles at all.

This is the most interesting part of nature: how evolution has helped individuals adapt and survive, so that a biological being can do, without thinking, very complex tasks that require very heavy calculations from a computer. Some can be simplified and "cheated", but most things still require heavy calculations in a time-sensitive manner.

And it all requires that we can split every simple task into smaller pieces, understand what is happening there, work out the math that gets a 1 or 0 from it, write an algorithm for it, and then write program code around that algorithm so that it does what is wanted, and does it within the critical time. That is as flexible and difficult as writing a poem that must have fifteen words, where every poem written by a different person would be different. So you don't just need experts in the field (mathematicians, physicists, programmers), you also need philosophers and logistics people.

And the hardest part is getting them to agree...

 

https://en.wikipedia.org/wiki/Dining_philosophers_problem

i7-8700k, 32GB 2666Mhz DDR4, 2x 2080S SLI 8GB, Oculus Rift S.

i7-8700k, 16GB 2666Mhz DDR4, 1080Ti 11GB, 27" 4K, 65" HDR 4K.


cnI48QS.png

 

 

cRmGQE0.png

 

Because AMD's 8-core Ryzen is a 4+4 design (two CCXs of four cores each), bouncing a thread from one set to the other (between CCXs) has more latency than on Intel's CPUs and previous AMD CPUs. So you don't want Windows to thread-bounce. Unfortunately, the industry trend of sacrificing single-core performance for multi-threaded workloads is not helping this type of workload (gaming). Some of these industry players should band together in some kind of group effort, put a foot down, and raise this problem with the big manufacturers.

 

Wader8, thanks for spending the time with Windows Performance Analyzer to explain how DCS runs on our systems. Very good info there!

 

Also, I would like to point out again that some of these things may not be specific to DCS; they could happen in any other game. Just to be clear: I'm not doing this in an attempt to find things to blame on DCS.



I don't think you will observe an huge difference with or without sound. I don't know how DCS sound engine is programmed, but, speaking of my own personal experience, the need to put sound process in its own thread is not about heavy CPU usage, but about time and buffer switching management with varying FPS... Sound process needs buffers to be switched at regular time (streaming), if you include this process withing your main loop (where FPS can be high or low depending situation, system, etc), Its like trying to synching two different worlds (a one can retard the other)... Putting sound with its own "while(true)" clockwork in a separated thread solves the problem, even if its not "multi-core".

 

Separating sound onto its own core is an easy solution that avoids half of the problems with the experience.

You can accept an FPS drop to 15 when a bomb hits and explodes, but hearing that explosion play back in a jittery fashion would be extremely annoying.

So just getting a trigger for when the explosion sound should happen, and playing that sound smoothly regardless of what you see on screen, is a very simple way to fix the annoyance, since the player's brain combines the sound and the visuals into one, which helps things feel "smooth".

And as you don't really need to delay anything, it is an easy task to do! Works great!


Also, I would like to re-point again that some of this things may not be DCS specifics, it could happen in any other game. Just to clear out, because I'm not doing this in some attempt just to find things to blame on DCS on purpose; not.

 

Completely understood. I just enjoyed the education of those tools that you used since the default Task Manager view I use does not provide enough information.

Steve (Slick)

 


Correction:

 

The CPU affinity setting is obviously not a sure way to control thread bouncing or the fake "Microsoft load balancing".

The only way to prevent it is to set CPU affinity to a single core. But that is a bad idea in DCS as well, because, as the following video (currently uploading) will show nicely in practice, DCS benefits from a second CPU core, but not beyond that.

As soon as more than one CPU core is available, the OS will start bouncing those DCS threads between cores, so it will again deceptively look as if the "load is spread over 2 cores evenly", giving the average gamer the impression that the 2 threads are using the 2 CPU cores equally. Unfortunately, that is all false.

Secondly, I am obviously not saying anything definitive about the performance impact of thread bouncing on actual games and DCS. Just to reiterate, in case someone thought I was writing all those big posts as if thread bouncing affects performance: no, I haven't tested it, and quite frankly I don't know if it's testable, for the reasons mentioned, unless we can get user control of the Windows kernel.

 

To clear up possible confusion, there are about 3 semi-separate things going on in this forum thread:

 

 

  • Discussion of what multi-core, multi-threading, and parallelization are.
  • Discussion of whether DCS is taking advantage of multiple CPU cores.
  • Discussion of the thread-bouncing/jumping/load-balancing phenomenon.

 

 

--------------------------------------------------

--------------------------------------------------

 

 

Here is a look at some of the tests I did for the multi-core test video, showing DCS's threads being forced onto a single core:

 

XC266gh.png

 

I found out that other things on the system could interfere with the DCS threads like that. At first I thought it was normal. I wasn't running anything in the background beyond the bare minimum, and I assumed that stuff would be spread across the 4 cores; but if DCS is really the primary task, the system shouldn't use the busiest DCS core for anything else at all.

 

Now, in isolation, that's how it's supposed to work on one CPU core: one thread pauses so that another can do a little work too, to mention the basics.

 

But looking at the big picture, I started thinking and then it clicked: everything else isn't being spread across the 4 cores the way I thought it should be. Everything else gets thread-bounced or "load balanced" as well, and lands on the 4th core, which had DCS in that test.

 

remaN3G.png

I should have seen that one coming :doh:

 

I didn't immediately know what the heck to do about it. Someone quickly told me that process priority could help in cases like this, so raising the DCS priority might help, as might the unusual measure of setting the affinity of all other existing processes away from the primary DCS core, or both.

 

So there are 2 ways of doing it, but only the second option is the sure one:

 

  1. Set DCS CPU Affinity to 2 physical CPU Cores and set DCS priority (in this case with HT off all CPU numbers are physical)
  2. Set DCS CPU Affinity to 2 physical CPU Cores and set all other processes CPU Affinity away from those 2 CPU Cores that DCS is using.
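As a rough sketch of the second option: affinity is normally expressed as a bitmask with one bit per logical CPU. The snippet below only computes the two masks; it does not apply them to any process. The core count and the choice of cores 2 and 3 for DCS are assumptions for illustration.

```python
def affinity_mask(cores):
    """Build a bitmask with one bit set per logical CPU index."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask

TOTAL_CORES = 4      # assumption: 4 physical cores, HT off
DCS_CORES = [2, 3]   # assumption: the two cores reserved for DCS

all_cores_mask = affinity_mask(range(TOTAL_CORES))
dcs_mask = affinity_mask(DCS_CORES)
# Everything else gets every core that DCS is NOT using.
others_mask = all_cores_mask & ~dcs_mask

print(f"DCS mask:    {dcs_mask:#06b}")
print(f"others mask: {others_mask:#06b}")
```

The two masks never share a bit, which is the whole point: nothing else is allowed onto the cores DCS is using.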

----------------------------------------------------

----------------------------------------------------

More speculative talk on the OS Scheduler (not necessarily specific to DCS) - take it with a grain of salt.

 

Again, I'm not making claims about the performance impact of thread bouncing; I don't know for sure.

 

But it's just silly that the threads of totally unrelated processes would bounce between cores like that to somehow "ease the load". The load should be the same whether a big thread is bounced around or kept on one core, and then the question is what the performance cost of the bounce is. If there is no practical performance difference and it's a heat-spreading thing, it should be advertised as such, with a user option to turn it off. If it's not that, then it comes down to taste in how the information is presented, and I don't want it presented in an unrealistic statistical fashion with averaging on top. Averages are always deceptive numbers because they don't exist physically in nature; they're artificial constructs made by observing over a certain time period. FPS is also an average, which is why the focus has shifted to frametimes. A thread should move only when there's a genuine balancing reason.
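A quick worked example of why an average like FPS can hide what frametimes show. The frametimes are invented: 99 smooth frames and a single long hitch.

```python
# Invented data: 99 frames at 10 ms each, plus one 200 ms hitch.
frame_times_ms = [10.0] * 99 + [200.0]

total_seconds = sum(frame_times_ms) / 1000.0
average_fps = len(frame_times_ms) / total_seconds

worst_frame_ms = max(frame_times_ms)
worst_frame_fps = 1000.0 / worst_frame_ms  # rate if every frame were that slow

print(f"average FPS:      {average_fps:.1f}")
print(f"worst frame time: {worst_frame_ms:.0f} ms "
      f"(equivalent to {worst_frame_fps:.0f} FPS)")
```

The average still looks healthy, while the one long frame is exactly the stutter you would notice in flight.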

 

Basically, my biggest problem with this "easing the load" idea, as it's promoted, is the confusion it creates. It deceptively sounds as if it somehow makes the CPU faster and uses all those cores, but logically it doesn't make sense that it would have such an effect when the same thread is just being bounced around.

And pretty much every average user probably thought the same. Maybe it's a genuine failure by the industry to explain it properly, but I think I can safely say that in practice most people got duped, including me. Right?

 

I do get how genuine thread balancing is meant to work, if we ignore same-thread bouncing. Take a scenario with 40 processes of 1 thread each, with 10 of those threads placed on CORE1. If those 10 threads get really busy, they would fill CORE1's CPU time, so it would be genuine to move some of them to other CPU cores alongside the remaining threads, which may each need less CPU time. You would then see some bouncing going on, but I suspect at a much lower frequency than the current behavior.
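The scenario above can be sketched as a toy balancer. This is not how any real scheduler is implemented; the demand figures (30% per busy thread, 1% per light thread) and the rule "move a thread only while the core is over capacity" are assumptions chosen to show that genuine balancing needs only a handful of moves.

```python
def rebalance(cores, capacity=100):
    """Move threads off overloaded cores onto the least-loaded core.

    `cores` is a list of lists of per-thread CPU demand in percent.
    Returns the number of migrations performed.
    """
    migrations = 0
    for source in cores:
        while sum(source) > capacity and len(source) > 1:
            target = min(cores, key=sum)   # least-loaded core right now
            if target is source:
                break                      # nowhere better to go
            target.append(source.pop())    # migrate one thread
            migrations += 1
    return migrations

# 40 single-threaded processes: 10 busy threads start on core 0,
# the 30 light ones are spread over the other three cores.
cores = [[30] * 10, [1] * 10, [1] * 10, [1] * 10]
migrations = rebalance(cores)

print("migrations:", migrations)
print("load per core (%):", [sum(core) for core in cores])
```

Once no core is over capacity the balancer stops moving anything, which is the "move only when there's a genuine reason" behavior, as opposed to bouncing the same thread around continuously.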

 

So in most circumstances, when the CPU isn't fully loaded, if this thread-bouncing thing were as big a deal as Microsoft and the CPU manufacturers advertise, you would see a performance difference, but we don't. We only get a performance difference between 1 and 2 CPU cores in DCS, and that's all about the second thread, which DCS presumably uses for audio and IO.

 

The proper test would be putting DCS on 2 or more cores and making the threads stick to particular cores without using CPU affinity. I'll see if priority alone helps in this regard.

 

But I don't want to jump to conclusions; I also want to think from the opposite side and look for anything that could support thread bouncing as a valid feature.

 

And right now the only other reason I can come up with for thread bouncing is the following theory ... read slowly ...

A thread that's using CORE1 has been paused so another thread can do its work on CORE1, and is scheduled to finish its chunk at a later time. Meanwhile CORE2 is free to accept new work before that time, and the scheduler determines it would be worthwhile to move the sleeping thread and its data to CORE2 rather than wait for CORE1 to be available again?

 

But does it really work like that? That's something I should research more. I'm getting busy with other work for a while, so I'll get back to this later.

 

A thought in a nutshell:

If the primary goal of the OS Scheduler is to keep all CPU cores equally loaded (for what reason? a marketing gimmick? performance? power saving?), which means it would have to bounce the same threads of all processes around to achieve that, then IMO that is surely NOT a goal that has maximum single-core game performance in mind.

Finally, the way the OS Scheduler works, with all the behaviors it has scripted, may not be compatible with such a specific workload as, for example, DCS, which means it would need a different set of rules, custom to such a scenario. Maybe it's normal that we all need to fiddle with CPU priority and CPU affinity to achieve the proper game-mode effect, and no manufacturer or OS vendor told us about it? Considering that single-core performance has been stagnating for many years and the business priority is on parallel cloud computing and enterprise power saving, that might not be a far-fetched theory at all.

 

-----

-----

 

EDIT:

ONE KEY THING about terminology/context:

It is load balancing the CPU cores, yes, if you use that term with regard to CPU cores. But when you look at it with regard to one process, and one thread being moved around and bounced, that is the fake load balancing I was talking about.


Edited by Worrazen

Modules: A-10C I/II, F/A-18C, Mig-21Bis, M-2000C, AJS-37, Spitfire LF Mk. IX, P-47, FC3, SC, CA, WW2AP, CE2. Terrains: NTTR, Normandy, Persian Gulf, Syria

 


I haven't tested it and quite frankly don't know if it's testable because of the mentioned reasons, unless we can get user control of the Windows kernel.

 

Actually, I guess thread bouncing is not even Windows kernel behavior, but the default behavior of the chipset/firmware. I guess the kernel only requests some control from the firmware to force this or that... My theory is that this behavior, as said before, allows reduced power consumption, but if you read between the lines, this literally means preventing the CPU from overheating.


I have since totally rewritten the second part of my post above, due to newer findings and understanding.

 

Actually, I guess thread bouncing is not even Windows kernel behavior, but the default behavior of the chipset/firmware. I guess the kernel only requests some control from the firmware to force this or that... My theory is that this behavior, as said before, allows reduced power consumption, but if you read between the lines, this literally means preventing the CPU from overheating.

 

Maybe it could be all about spreading the heat across all 4 cores ... but that's not the kind of OS Scheduler strategy we would want for single-thread-critical workloads such as DCS. Did we finally break the spell?

 

I think that as long as thread-bouncing (the same thread moving across many cores all the time) doesn't majorly affect performance for that thread/process that's being bounced, it's probably not a big deal in most cases of general computer use, where you don't have any particular program that relies mostly on one thread.

People reading this thread should understand that we're digging through the smallest, most minute details right now, and shouldn't need to worry about half of it.

 

------

So two things are needed to create best-case circumstances:

 

  • First you need to keep your top-priority thread from bouncing to the other cores, which you leave for system/background work. (AFAIK there is no ability to set affinity/priority per thread in Windows, so the DCS audio thread and main thread will still bounce between each other.)
  • Then you need to make everything else not bounce into it.

e8xEUce.png

 

We would need the ability to set CPU affinity on a per-thread level, something I don't know is possible under Windows 10. If someone thinks it can be done, let me know. Then we could see in a real test whether or not pure thread bouncing has any performance benefit at all.

 

-------

-------

Another hunch: what if, when it bounces, it has already completed a piece of work, so it doesn't need to move any data from cache to cache, because it would start fresh on a new set of calculations anyway?

 

EDIT: Here's another question: what if some of the bouncing is necessary, like when one thread needs data? Or does that never happen, and it just waits for a cache update? Another thing for me to read up on.

 

EDIT2: Looks like it is possible to specify affinity per thread in the Win32 API (but not from the user environment): https://msdn.microsoft.com/en-us/library/windows/desktop/ms686247(v=vs.85).aspx
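For reference, a minimal sketch of what per-thread affinity looks like from code, assuming the linked page describes the Win32 call `SetThreadAffinityMask`. It uses Python's `ctypes` and pins only the calling thread of its own process; the helper name `pin_current_thread` is made up for this example, and on anything other than Windows it simply does nothing.

```python
import ctypes
import sys

def pin_current_thread(core_index):
    """Pin the calling thread to one logical CPU via the Win32 API.

    Returns the previous affinity mask, or None when not on Windows.
    """
    if sys.platform != "win32":
        return None  # the call below only exists on Windows
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetCurrentThread.restype = wintypes.HANDLE
    kernel32.SetThreadAffinityMask.argtypes = [wintypes.HANDLE,
                                               ctypes.c_size_t]
    kernel32.SetThreadAffinityMask.restype = ctypes.c_size_t

    # One bit per logical CPU; a zero return value signals failure.
    previous = kernel32.SetThreadAffinityMask(
        kernel32.GetCurrentThread(), 1 << core_index)
    if previous == 0:
        raise ctypes.WinError(ctypes.get_last_error())
    return previous

result = pin_current_thread(0)
print("previous affinity mask:", result)
```

Note that this only affects a thread inside the program that makes the call. Pinning the threads of another program such as DCS from outside would need a handle to each of those threads with sufficient access rights, which this sketch does not attempt.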


Edited by Worrazen

Modules: A-10C I/II, F/A-18C, Mig-21Bis, M-2000C, AJS-37, Spitfire LF Mk. IX, P-47, FC3, SC, CA, WW2AP, CE2. Terrains: NTTR, Normandy, Persian Gulf, Syria

 


Thanks for revealing this in a way that can't be disputed.

I'm glad you aren't getting torn apart by the fan club yet.

 

So in conclusion, nothing in this regard has significantly changed from two years ago. :-(


Edited by aairon

Flying sims since 1980

 


Mobo: Asus Z170 Pro Gaming

CPU: i7 6700K @ 4.7 GHz

Video: EVGA GTX 1080

Ram: Patriot DDR4 2800 8GBx2

PWR:Corsair RM750i


Thanks for revealing this in a way that can't be disputed.

I'm glad you aren't getting torn apart by the fan club yet.

 

So in conclusion, nothing in this regard has significantly changed from two years ago. :-(

 

Heh, I'm also doing this just because I like it. This is one case out of many, not necessarily an attempt to hunt something down in DCS. It was a great adventure, because in the process I learned key things I didn't know before, so I was surprised for a bit. I might have dug too hard into the thread bouncing, but it doesn't seem to be that big of a performance deal.

 

I was hunting for whether that's causing some of the microstuttering, which with 2.5 and an SSD I haven't really noticed yet, but I wasn't looking either. In the 10 hours I played I was in enjoy-mode and didn't notice anything except a few hitches, like 3 or 4 of 0.1 or 0.2 seconds, which is probably the asset-loading thing. Stuff like that is really nothing compared with before on Win7 and an HDD.

 

Continuing: I haven't been around long enough to know in depth how it was before, and going into this I wasn't expecting anything different.

 

As others in previous pages said, nothing around this has been announced or detailed.

 

Plus, the Vulkan API is just around the corner, and that would be the best place to see some CPU optimizations; Vulkan would be a bit pointless if the CPU side were left alone. So it actually makes sense that they don't want to waste resources on optimizing the older environment with the DX11 API.

 

So this is not the time to be sad; you'll get to see some other feature or module, or the Vulkan API, sooner because of the resources saved by not optimizing the existing environment. Of course I could be wrong, and when the Vulkan API is released you might have an option to choose which API to use, and there could be *some* CPU optimizations that work for both. This is pure speculation.

 

The Vulkan API is open and gets updated, unlike the closed-down DirectX, and a developer also has lower-level access, so there are more opportunities for optimization later on. It's not as if performance will be set in stone once DCS gets its first Vulkan API release.

 

Here's that video of the comparison between 1, 2, 3 and 4 CPU cores:

It's a bit slow since it's hard to hold the phone with one hand; the key moment is somewhere in the middle.

Modules: A-10C I/II, F/A-18C, Mig-21Bis, M-2000C, AJS-37, Spitfire LF Mk. IX, P-47, FC3, SC, CA, WW2AP, CE2. Terrains: NTTR, Normandy, Persian Gulf, Syria

 


So in conclusion, nothing in this regard has significantly changed from two years ago. :-(

 

This is pretty much what we were told several years ago. No surprise.

ASUS ROG Maximus VIII Hero, i7-6700K, Noctua NH-D14 Cooler, Crucial 32GB DDR4 2133, Samsung 950 Pro NVMe 256GB, Samsung EVO 250GB & 500GB SSD, 2TB Caviar Black, Zotac GTX 1080 AMP! Extreme 8GB, Corsair HX1000i, Phillips BDM4065UC 40" 4k monitor, VX2258 TouchScreen, TIR 5 w/ProClip, TM Warthog, VKB Gladiator Pro, Saitek X56, et. al., MFG Crosswind Pedals #1199, VolairSim Pit, Rift CV1 :thumbup:


Here's that video of the comparison between 1, 2, 3 and 4 CPU cores:

 

So with that video it looks like CPU utilization and frames go up from 1 to 2 processors but does not change when you add more. I wonder if any more info could be learned by adding the CPU Time, Threads, and Handles columns on the Task Manager Details tab?

Steve (Slick)

 

ThrustMaster T.Flight Hotas X | TrackIR5 Pro | EVGA GTX 1070 | Win10 64-bit Professional | Dell Precision 7920 Workstation | 1 TB SSD | 128 GB Memory | Dual Intel Xeon Platinum 2.0 GHz 16 Core Processors (64 Total w/HT ON) | 24" Dell Monitor


So in conclusion, nothing in this regard has significantly changed from two years ago. :-(

 

The most important thing to understand, I think, is that nothing can really be significantly changed. The debate about multi-core CPUs and games has existed since the first dual-core CPUs (10 years ago), always with the same "myths" and the same answer. There is no way to really exploit multiple cores to improve the performance of such game-like applications, even if the application is multi-threaded. Multi-core CPUs were designed both for commercial purposes and because of technical limitations around voltage/heat (this is my opinion).

 

What a multi-core CPU provides is simply additional computing units across which load is spread, allowing the CPU to run at a very high clock frequency without burning up (this is how I interpret Intel's and AMD's multi-core strategy)... The small "plus" is that this allows some OS/thread-related optimizations running at a low level, done by the hardware/firmware in an automated way. That's all; the "illusion" stops here. DCS is probably already optimized as well as it can be, considering the real technical limitations and APIs... Simply enjoy it, and stop believing a game can magically exploit the full potential of an 8-core CPU using some super programming technique. This is a myth, partially maintained by CPU manufacturers.


Edited by sedenion

This topic is hard to pin down, as there are many reasonable results; all seem valid, but they span a huge range of values. Then there is multithreading and hyperthreading and multicore and SMP, with DCS as one executable and DirectX11 as the second big player involved when actually executing dcs.exe. Add VR or streaming, the two heavyweights among the other tasks you may have running, plus all the other small apps that one may run alongside DCS, be it TS, SRS, MSI Afterburner, Mainboard-OC-Suite, antivirus, SoundManager, etc.

 

I have monitored my system usage throughout the 2.5 release and I myself have very variable results when it comes to CPU usage. I have had flights with ProcessLasso and 6 cores being pretty well used, pure 2D, not even TrackIR, the most simple setup actually, and I have flown VR and 2D with much less CPU usage as opposed to my 1st findings. This puzzles me a bit.

Through ProcessLasso, though, all the work is on the 6 real cores, and it seems that the 4 cores that cannot be used by DCS (2 cores only) are more or less evenly used by DX11. This is an assumption; I have no DX knowledge, and from reading up on it, DX11 is capable of using... and that's what I don't know: HOW MANY cores? 2, or 4? Where does this load of sometimes 40-50% on all 4 of those cores come from? It can't be DCS, and it can't be VR or any of the others, as they weren't in use. AV was OFF; I usually turn it all off, regardless of how many cores I run. It's an old habit from Win95 days, I guess. Can DirectX11 use 4 cores at about 50%? Could it use all 4 at 100%, given a GPU that could use it? Would such a setup make sense for future GPUs?

 

Anyway, with Vulkan on the horizon, we are beating an almost dead horse when it comes to "what will DCS use, what makes sense for dedicated DCS rigs?".

 

Still, it would be nice to know what DX11 can actually use. I don't like guessing such things at all. You either know it or you don't, and I don't.

Gigabyte Aorus X570S Master - Ryzen 5900X - Gskill 64GB 3200/CL14@3600/CL14 - Asus 1080ti EK-waterblock - 4x Samsung 980Pro 1TB - 1x Samsung 870 Evo 1TB - 1x SanDisc 120GB SSD - Heatkiller IV - MoRa3-360LT@9x120mm Noctua F12 - Corsair AXi-1200 - TiR5-Pro - Warthog Hotas - Saitek Combat Pedals - Asus PG278Q 27" QHD Gsync 144Hz - Corsair K70 RGB Pro - Win11 Pro/Linux - Phanteks Evolv-X 


... I have had flights with ProcessLasso and 6 cores being pretty well used, ...

 

Are you sure those weren't just a few threads being bounced around? ;)

 

Now, the information is correct when it comes to the CPU cores themselves: yes, the cores are being utilized by the amount it says. The "fake" I meant refers to the impression it gives off.

 

But see, everyone's being tricked into thinking that when they run a certain big program, it is "multi-threaded and spreading load", or that "Windows is spreading the load to multiple cores", and it all falsely fuels the idea that the multi-core CPU was a worthwhile purchase. I'm not saying we should stick with dual-core, just that things should be more transparent; it would be understandable, since hardware is often a step ahead of software. As it stands, it gives a hugely false impression. A standard task manager should have another graph to clear this up, nothing too detailed, just a "top 10 threads" view or something, showing the % of CPU each thread uses along with the name of the process it comes from. You'd immediately see 2 threads doing most of the work for DCS, for example.
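A rough sketch of what such a "top threads" view could compute, assuming per-thread CPU samples were available from some monitoring tool. The sample numbers, thread IDs, and the `ThreadSample` type are invented for illustration; this does not query a real system.

```python
from collections import namedtuple

ThreadSample = namedtuple("ThreadSample", "process thread_id cpu_percent")

# Invented samples, as a monitoring tool might report them.
samples = [
    ThreadSample("DCS.exe", 1001, 92.0),
    ThreadSample("DCS.exe", 1002, 31.0),
    ThreadSample("DCS.exe", 1003, 0.4),
    ThreadSample("explorer.exe", 2001, 1.5),
    ThreadSample("audiodg.exe", 3001, 2.2),
]

def top_threads(samples, limit=10):
    """Return the busiest threads first, regardless of which core ran them."""
    ranked = sorted(samples, key=lambda s: s.cpu_percent, reverse=True)
    return ranked[:limit]

for sample in top_threads(samples, limit=3):
    print(f"{sample.process:<14} thread {sample.thread_id}: "
          f"{sample.cpu_percent:5.1f}% of one core")
```

Because the ranking is per thread rather than per core, it stays the same no matter how often the scheduler moves those threads between cores.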

 

Maybe in the future these manufacturers should seriously think about creating CPUs with 2 dedicated high-speed cores meant for really busy threads. It could all be standardized: those cores would be designated for special use and would signal to the OS not to put anything on them unless the application's source code specifically asks for them. Most of the OS services and the kernel would run on everything else, while those fast single-thread cores would be ready to take a specific load when that application is launched, and the OS scheduler would have specific behavior to let only 1 or 2 threads onto each such core and that's it.

Modules: A-10C I/II, F/A-18C, Mig-21Bis, M-2000C, AJS-37, Spitfire LF Mk. IX, P-47, FC3, SC, CA, WW2AP, CE2. Terrains: NTTR, Normandy, Persian Gulf, Syria

 


In the first screenshot you can see that, on the Details tab, you can right-click a column header such as CPU and then choose Select columns.

 

On the second screenshot, I chose CPU time, handles, and threads.

 

At that point you can sort by the processes that have the most threads.

TM1.jpg.0159f1621fd8c269460f06f208657d9f.jpg

TM2.jpg.f61a9a3352d28579698214ee03860dae.jpg

Steve (Slick)

 

ThrustMaster T.Flight Hotas X | TrackIR5 Pro | EVGA GTX 1070 | Win10 64-bit Professional | Dell Precision 7920 Workstation | 1 TB SSD | 128 GB Memory | Dual Intel Xeon Platinum 2.0 GHz 16 Core Processors (64 Total w/HT ON) | 24" Dell Monitor


 

Plus, the Vulkan API is just around the corner, and that would be the best place to see some CPU optimizations; Vulkan would be a bit pointless if the CPU side were left alone. So it actually makes sense that they don't want to waste resources on optimizing the older environment with the DX11 API.

 

I find this fascinating. I had not heard that DCS was working on using the existing Vulkan APIs, which currently work on all Windows platforms as well as Linux.

So could you be so kind as to post a link to the announcement by DCS about this?

Thanks in advance. Aairon

 

Never mind, after a little searching I found it.

 

https://forums.eagle.ru/showthread.php?t=202108&highlight=vulcan

 

Absolutely no time frame whatsoever so it could be a very...long..time.


Edited by aairon

Flying sims since 1980

 


Mobo: Asus Z170 Pro Gaming

CPU: i7 6700K @ 4.7 GHz

Video: EVGA GTX 1080

Ram: Patriot DDR4 2800 8GBx2

PWR:Corsair RM750i


Actually, I must correct myself: I had the trees on max. If I put them down to 20%, I now get 90 FPS when not much is going on nearby, even with battles going on elsewhere.

 

Judging by the fact trees are now much denser, and actually affect gameplay, it was expected they would be a resource hog.

 

So with a fair comparison, my FPS is definitely much better: from a semi-stable 50 it's now about 70 on average, and it's also more stable. That's just what others have reported, which is why I checked my settings.

 

I really think people shouldn't assume that just because it's still 2 threads, it's not an improvement. The term "optimizations" usually means improving an existing system, whereas more threads would be an "upgrade". The Vulkan API should improve things even further even without adding more threads, but I'm sure they'll try something if it makes sense. Separating audio and IO into their own threads might not make sense if that thread isn't reaching 100% in most cases on recommended HW, but we'll see.

 

Oh btw, I never use any anti-aliasing, so MSAA was off in both cases.

Modules: A-10C I/II, F/A-18C, Mig-21Bis, M-2000C, AJS-37, Spitfire LF Mk. IX, P-47, FC3, SC, CA, WW2AP, CE2. Terrains: NTTR, Normandy, Persian Gulf, Syria

 

