Cheat Engine Forum Index Cheat Engine
The Official Site of Cheat Engine
 


GPGPU Memory Scanner

 
Slugsnack
Grandmaster Cheater Supreme
Reputation: 71

Joined: 24 Jan 2007
Posts: 1857

Posted: Thu Mar 17, 2011 6:45 am    Post subject: GPGPU Memory Scanner

Just thought I'd throw this idea out there since I recently picked up CUDA (and some OpenCL). I was wondering about people's thoughts on creating a memory scanner using this technology. The problem of searching memory (especially small sizes) is extremely parallel and lends itself to GPGPU. I am sure significant performance gains can be made, but I'm not sure if I can be bothered. This project would be more for fun and learning, although it is more than possible it could result in a faster CE.

The process I imagine is fairly simple:
- Copy pages to be scanned from main memory to GPU
- Each SIMD engine runs a warp/wavefront which fills a bitmap representing matches
- GPU result copied back to CPU

I think thread divergence can be avoided at stage 2 by using XOR operators and such. Not sure on that one though.
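To make the three steps and the XOR idea concrete, here is a minimal CPU-side sketch in C++. It is not CUDA and not how CE scans; the names `equal_mask` and `scan_exact` are made up for illustration, and it assumes an exact-value scan for 4-byte values at 4-byte-aligned offsets. Each loop iteration stands in for what one GPU thread would do.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: returns 1 if a == b, else 0, without a branch.
// (a ^ b) is zero only when the values are equal; for any nonzero x,
// (x | -x) has its top bit set.
inline std::uint32_t equal_mask(std::uint32_t a, std::uint32_t b) {
    std::uint32_t x = a ^ b;
    return ((x | (0u - x)) >> 31) ^ 1u;
}

// Scans `size` bytes for a 4-byte `target` at every 4-byte-aligned offset.
// Bit (i % 32) of word (i / 32) in the returned bitmap is set when the
// value at byte offset 4 * i matches.
std::vector<std::uint32_t> scan_exact(const unsigned char* data,
                                      std::size_t size,
                                      std::uint32_t target) {
    const std::size_t count = size / sizeof(std::uint32_t);
    std::vector<std::uint32_t> bitmap((count + 31) / 32, 0u);
    for (std::size_t i = 0; i < count; ++i) {  // i plays the role of the thread index
        std::uint32_t value;
        std::memcpy(&value, data + i * sizeof value, sizeof value);
        bitmap[i / 32] |= equal_mask(value, target) << (i % 32);
    }
    return bitmap;
}
```

On a real GPU, 32 threads would share one bitmap word, so the plain `|=` above would probably have to become an atomic OR or a warp vote. That part is an assumption and is not shown here.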

Any thoughts or input? Or has anyone attempted a similar project before? Suggestions?
hcavolsdsadgadsg
I'm a spammer
Reputation: 26

Joined: 11 Jun 2007
Posts: 5801

Posted: Thu Mar 17, 2011 4:07 pm

i'm not so sure this will lend itself to GPUs as well as you think.
HomerSexual
Grandmaster Cheater Supreme
Reputation: 5

Joined: 03 Feb 2007
Posts: 1657

Posted: Thu Mar 17, 2011 6:18 pm

slovach wrote:
i'm not so sure this will lend itself to GPUs as well as you think.


Why not? GPUs are used for high performance computing all the time (Folding@home, for example).

hcavolsdsadgadsg
I'm a spammer
Reputation: 26

Joined: 11 Jun 2007
Posts: 5801

Posted: Thu Mar 17, 2011 8:18 pm

it doesn't sound like a meaningful amount of work. even if you could get everything scannable into memory in the right format, there are all kinds of gotchas to avoid stalling, etc. it sounds like it could be branchy, which is the last thing GPUs want to see. performance may be iffy without certain hardware support.

i don't know too much about it but it doesn't seem like a very applicable GPU problem. encoding videos and crunching math for folding proteins sounds sensible enough, this doesn't.
atom0s
Moderator
Reputation: 205

Joined: 25 Jan 2006
Posts: 8585
Location: 127.0.0.1

Posted: Fri Mar 18, 2011 12:57 am

I could have sworn I've seen something like this before. I'll keep an eye out for it if I can find it again but this sounds really familiar.
Slugsnack
Grandmaster Cheater Supreme
Reputation: 71

Joined: 24 Jan 2007
Posts: 1857

Posted: Fri Mar 18, 2011 3:27 am

slovach wrote:
it doesn't sound like a meaningful amount of work. even if you could get everything scannable into memory in the right format, there are all kinds of gotchas to avoid stalling, etc. it sounds like it could be branchy, which is the last thing GPUs want to see. performance may be iffy without certain hardware support.

i don't know too much about it but it doesn't seem like a very applicable GPU problem. encoding videos and crunching math for folding proteins sounds sensible enough, this doesn't.

the point of gpu processing is that you don't care about stalls: there are so many threads that, instead of stalling, the gpu simply switches to another one.

i think for shorter scans there is no point, but there are times when ce would take minutes on particular scans. i don't remember which cases since i haven't used ce for a long time, but i think it's something like scanning for an unknown initial value, then scanning for increased values, or something like that
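For reference, the "unknown initial value, then increased" case can be sketched as a filter over a candidate bitmap. This is only a CPU-side C++ illustration under assumptions: the function name `rescan_increased` is made up, values are taken to be unsigned 32-bit, and two snapshots of the same addresses are assumed to already be in memory. An unknown initial value scan corresponds to starting with every candidate bit set.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Keeps only the candidates whose value increased between two snapshots.
// `previous` and `current` hold the same addresses at two points in time
// and must have the same length. `candidates` has one bit per value
// (bit i % 32 of word i / 32) and is refined in place.
// Returns the number of candidates that remain.
std::size_t rescan_increased(const std::vector<std::uint32_t>& previous,
                             const std::vector<std::uint32_t>& current,
                             std::vector<std::uint32_t>& candidates) {
    std::size_t remaining = 0;
    for (std::size_t i = 0; i < current.size(); ++i) {
        const std::uint32_t increased = current[i] > previous[i] ? 1u : 0u;
        // Clear bit i when the value did not increase; otherwise leave it alone.
        candidates[i / 32] &= ~((increased ^ 1u) << (i % 32));
        remaining += (candidates[i / 32] >> (i % 32)) & 1u;
    }
    return remaining;
}
```

Every element is read exactly once and nothing depends on a neighbouring element, which is why this kind of rescan looks like a fit for one thread per value. Whether it beats the copy cost to the GPU is a separate question.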
HomerSexual wrote:
slovach wrote:
i'm not so sure this will lend itself to GPUs as well as you think.


Why not? GPUs are used for high performance computing all the time (Folding@home, for example).

gpu programming is not really for high performance computation as such. there are only particular problems that it is useful for. the main thing to note about gpus is that the emphasis is not on the latency and speed of an individual thread (as it is with a cpu) but instead on the throughput of many, many different threads.

gpu programming is extremely fast for processing lots of different things using the same algorithm. that is, running the same code through many threads, with the only difference being the memory they operate on. that is why gpus are great for things like bruteforcing as well. however, as you can see, gpgpu only suits a very specific set of problems.
hcavolsdsadgadsg
I'm a spammer
Reputation: 26

Joined: 11 Jun 2007
Posts: 5801

Posted: Fri Mar 18, 2011 11:46 am

you're right that there's a lot of latency on the GPU, and that the throughput is high but you can stall it all the same. the pipeline is long but the throughput masks the latency. it's one of the reasons why you have to batch work so aggressively, the GPU can't feed itself.

i think you can explicitly keep synchronization, but in case of a stall it's possible for some threads to finish before others. data dependency / memory accesses may be problematic.
Slugsnack
Grandmaster Cheater Supreme
Reputation: 71

Joined: 24 Jan 2007
Posts: 1857

Posted: Sun Mar 20, 2011 6:44 am

i don't understand where you think there is a data dependency problem in scanning memory, which is all reading. you cannot get conflicts (and hence stalls) from just reading and comparing.

of course you will get compulsory cache misses when the data is first copied to gpu memory, before it reaches the caches, but this is unavoidable either way. with regard to synchronization, i don't see this as a problem. we have functions that block until the gpu completes, such as cudaThreadSynchronize()

maybe it is worth coding a proof of concept comparison kernel, since i think the main cause of the lengthy scan times in ce is likely cache misses.
Dark Byte
Site Admin
Reputation: 470

Joined: 09 May 2003
Posts: 25785
Location: The netherlands

Posted: Sun Mar 20, 2011 9:45 am

the speed of scanning depends on the speed of ram, how much ram you have, and the hard disk speed
If you don't have much ram (e.g. less than 4GB, or you're on a 32-bit os), scanning a full game will be affected by the paging system, because windows will be busy paging the game's memory in and out almost constantly

Most of the time when scanning on an old system, the time spent waiting for the hard disk is greater than the time spent comparing the memory

also, it's a bad idea to run ce 5.X and older versions from a USB stick, as they store the temp results in the folder they're running from

Slugsnack
Grandmaster Cheater Supreme
Reputation: 71

Joined: 24 Jan 2007
Posts: 1857

Posted: Sun Mar 20, 2011 5:14 pm

have you ever run a profiler to determine the ratio of time spent computing compares to time spent on memory loads? if it is half, it may still be a worthwhile project
All times are GMT - 6 Hours