GMLscripts.com

Discuss and collaborate on GML scripts
Invert

You are not logged in.

#21 2011-04-24 23:21:29

xot
Administrator
Registered: 2007-08-18
Posts: 1,201

Re: CHALLENGE: Conway's Game of Life

icuurd12b42 wrote:

Oh oh oh, I could blend in an answer.... But I wont. I just think an answer is surfacing in my head.

2MiyadL.gif


Abusing forum power since 1986.

Offline

#22 2011-04-25 02:23:30

~Dannyboy~
~hoqhuue(|~
From: Melbourne, Australia
Registered: 2009-10-02
Posts: 21
Website

Re: CHALLENGE: Conway's Game of Life

Naive solutions for a 40x40 grid run around 1 FPS on my machine.

Just so I have a rough idea of where I'm at, how fast does this simple solution run on your machine?
https://docs.google.com/leaf?id=0Byhdbm … y=CNrEwNsE (EDIT: Look down.)
For anyone interested this is the algorithm, it seems to be the most obvious way to do it. I have a few ideas but I'm unsure as to how much they will improve it, I'll get back to you when (if) I've tried them.

/// Apply Rules of Life
{
    var next, sum, i, j;
    next = ds_grid_create(40, 40);
    ds_grid_copy(next, grid);
    
    surface_set_target(surface);
    draw_primitive_begin(pr_pointlist);
    
    for (j = 0; j < 40; j +=1) {
        for (i = 0; i < 40; i += 1) {
            sum = ds_grid_get_sum(grid, i - 1, j - 1, i + 1, j + 1);
            if (ds_grid_get(grid, i, j)) {
                // Currently alive
                if (sum < 3 or sum > 4) {
                    ds_grid_set(next, i, j, 0);
                    draw_vertex_color(i, j, 0, 1);
                }
            } else {
                // Currently dead
                if (sum == 3) {
                    ds_grid_set(next, i, j, 1);
                    draw_vertex_color(i, j, 16777215, 1);
                }
            }
        }
    }
    
    draw_primitive_end();
    surface_reset_target();
    
    ds_grid_destroy(grid);
    grid = next;
}

EDIT: Don't look if you don't want to know xot's method.

Spoiler

Oh oh oh, I could blend in an answer.... But I wont. I just think an answer is surfacing in my head.

lol, very subtle. I did consider that approach but I'm rather terrible with blend modes. It does make sense to move the work to the GPU though as it's a very parallel algorithm.

EDIT2: I noticed that I had forgotten to make i and j script local variables, fixing this gives a 4% fps boost on my machine... Why can't that be the default!
https://docs.google.com/leaf?id=0Byhdbm … y=CJaEwtIC

Last edited by ~Dannyboy~ (2011-04-25 03:21:37)

Offline

#23 2011-04-25 02:45:20

icuurd12b42
Member
Registered: 2008-12-11
Posts: 303

Re: CHALLENGE: Conway's Game of Life

~Dannyboy~ wrote:

Just so I have a rough idea of where I'm at, how fast does this simple solution run on your machine?

166 on mine

EDIT: Don't look if you don't want to know xot's method.

Spoiler

Oh oh oh, I could blend in an answer.... But I wont. I just think an answer is surfacing in my head.

lol, very suttle. I did consider that approach but I'm rather terrible with blend modes. It does make sense to move the work to the GPU though as it's a very parallel algorythm.

Cant wait to see it

Last edited by icuurd12b42 (2011-04-25 02:47:53)

Offline

#24 2011-04-25 12:42:44

xot
Administrator
Registered: 2007-08-18
Posts: 1,201

Re: CHALLENGE: Conway's Game of Life

~Dannyboy~ wrote:

Just so I have a rough idea of where I'm at, how fast does this simple solution run on your machine?

I'm getting a very respectable 115 fps.


Abusing forum power since 1986.

Offline

#25 2011-04-25 23:42:47

Manuel777
InvaderGames
Registered: 2011-04-25
Posts: 4
Website

Re: CHALLENGE: Conway's Game of Life

Here's my approach:
http://www.mediafire.com/?ujpml2pk20otxh2

I was able to pull 35fps on my pc, but im quite certain it can be boosted (just dont see how..)  i will try a different point of view later.. if i can come up with something smile

Edit: I havent realized my grid is actually 48x48..

Last edited by Manuel777 (2011-04-25 23:51:14)

Offline

#26 2011-04-26 04:24:38

~Dannyboy~
~hoqhuue(|~
From: Melbourne, Australia
Registered: 2009-10-02
Posts: 21
Website

Re: CHALLENGE: Conway's Game of Life

I was disappointed that I gained only a tiny 1% increase by reusing my temporary grid rather than creating and destroying each frame. On the other hand I was thrilled that keeping a cache of the number of alive neighbours each cell has gave me a whopping 20% boost!

https://docs.google.com/leaf?id=0Byhdbm … y=CIXquMwC

~260fps on my machine.

I think that's about as good as I can get by looping though every cell. I guess the next thing to try is to keep a set of cells which need to checked and only looping through those... I'd be surprised if I get that working any better, but I've been surprised before!



EDIT: Just for the fun of it, I made my engine run the 400x400 example. Not surprisingly it ran at around 2-3fps (1/100th the speed of the 40x40 example). Here it is, please excuse the half a minute of blank window as it reads in the image pixel by pixel tongue

https://docs.google.com/leaf?id=0Byhdbm … y=CKDUl68C

Last edited by ~Dannyboy~ (2011-04-26 06:21:09)

Offline

#27 2011-04-26 06:47:05

Manuel777
InvaderGames
Registered: 2011-04-25
Posts: 4
Website

Re: CHALLENGE: Conway's Game of Life

haha danny, your engine is just like mine, but hell'o faster! the difference is minimal smile

Offline

#28 2011-04-26 07:07:45

~Dannyboy~
~hoqhuue(|~
From: Melbourne, Australia
Registered: 2009-10-02
Posts: 21
Website

Re: CHALLENGE: Conway's Game of Life

haha danny, your engine is just like mine, but hell'o faster! the difference is minimal smile

Funny isn't it! Focusing on reducing the amount of work needed to be done in the parts of code that are run the most can make a huge difference!

I haven't used GM for quite a while, but I just remembered another quirk to abuse. The repeat loop is noticeably faster than a for loop when the length of the loop is known at the start (as the condition is checked in compiled code). +9%fps get!

https://docs.google.com/leaf?id=0Byhdbm … y=CM7h6_EC

EDIT: Terribly sorry about the spam with all my different versions, but I'm excited!

I noticed that as I was caching the number of living neighbours a cell had, I no longer needed two copies of the state of the grid, therefore I no longer needed to clone the grid, therefore I could use an array instead of a grid and therefore I gained another 20%fps boost!

https://docs.google.com/leaf?id=0Byhdbm … y=CNmx5dgB

Last edited by ~Dannyboy~ (2011-04-26 07:40:33)

Offline

#29 2011-04-26 10:07:52

xot
Administrator
Registered: 2007-08-18
Posts: 1,201

Re: CHALLENGE: Conway's Game of Life

Great work so far, everybody. I hoped this would be a good exercise in optimization. I'm really pleased to be seeing the different tricks people are trying, many of which I didn't anticipate.

And, ~Dannyboy~, you never have to apologize for posting interesting stuff here. wink


Abusing forum power since 1986.

Offline

#30 2011-04-26 19:57:13

superjoebob
Member
Registered: 2011-04-26
Posts: 1

Re: CHALLENGE: Conway's Game of Life

This looked like a lot of fun, so here's my entry so far.

www.wonthelp.info/superjoebob/GosperGun.zip

I'm getting around 220 FPS, what are you getting xot?

Offline

#31 2011-04-27 00:30:58

xot
Administrator
Registered: 2007-08-18
Posts: 1,201

Re: CHALLENGE: Conway's Game of Life

Welcome, superjoebob!

I'm getting around 117 fps which is very good for my machine. I'm amazed by how much faster it is for you.

Don't forget to post your editable version before Monday.


Abusing forum power since 1986.

Offline

#32 2011-04-28 13:27:23

OpticalLiam
Member
Registered: 2007-10-11
Posts: 8

Re: CHALLENGE: Conway's Game of Life

Here is my surface-based solution:

GMK, EXE

A 575 x 525 simulation is running at about 220fps on my machine. I have no idea how well this will work on other machines, it's pretty graphics memory and fill-rate heavy. It's also probably far from the best way of going about it, as I was only able to spend a few hours on it (due to my dissertation, damn you for telling me about this challenge, xot!). It was made in a pretty experimental way so I didn't really calculate anything properly. It could probably be made much more efficient by someone smarter than me!

Cheers.

Last edited by OpticalLiam (2011-04-28 13:57:48)

Offline

#33 2011-04-28 15:48:12

xot
Administrator
Registered: 2007-08-18
Posts: 1,201

Re: CHALLENGE: Conway's Game of Life

YES!!!

I'm so incredibly pleased someone else managed this. Check it out, folks, and let the mind-blowing begin. You are THE MAN, Liam!

Since this technique has now been posted, I'll post mine as well.

Download xot_life-blend.zip from Host-A

Like Liam, I tinkered until I got it working and haven't spent any time figuring out how to do it better. It seems to run a tiny bit faster than Liam's.

I do something naughty, which is draw a surface to itself. This an undefined operation as far as DirectX is concerned, and it may not work for everyone. It would be pretty easy to modify it to work within DirectX guidelines, but it would be a hair slower.

OK, now what was the significance of my earlier clue, and what was it meant to imply?

8dAeL5a.png

The clue in the image is the red, green, and blue channels which don't appear to have any relationship to each other. One particular case where red, green, and blue have no relationship is blending modes. They operate on each channel independently. The significance of this is that this CA engine is in some ways three times as fast as it appears; it can run three different patterns simultaneously (red, green, and blue) without any impact on performance. Press '7' in my demo to see this.

Although I've foregone improvements, I have spent some time working out (on paper) some other CAs running with this same technique. If I get them going, I'll post the results here.


Abusing forum power since 1986.

Offline

#34 2011-04-28 16:23:52

OpticalLiam
Member
Registered: 2007-10-11
Posts: 8

Re: CHALLENGE: Conway's Game of Life

Awesome, I compared xot's with mine and they work in almost exactly the same way overall, only his is a tiny bit faster (and uses one less surface, I was lazy with my reuse!).

xot wrote:

I do something naughty, which is draw a surface to itself.

Yeah, this technique is useful for quite a lot of other interesting things I've done in the past (maybe I'll post them around here soon).

xot wrote:

The significance of this is that this CA engine is in some ways three times as fast as it appears; it can run three different patterns simultaneously (red, green, and blue) without any impact on performance.

Aha, so that's what that was about, I didn't really think of that. A neat by-product of the method, though I'm not sure if it can be put to too much use (other than novelty).

xot wrote:

Although I've foregone improvements, I have spent some time working out (on paper) some other CAs running with this same technique. If I get them going, I'll post the results here.

I've also (briefly) thought about how these methods can be generalised to create other rules and other CAs, I might have a go at that too later on. In fact, during my experimentation I ended up with many awesomely weird CAs, some that spread like blobs and even some that generated mazes.

I still think it may be possible to reduce a lot of the drawing in our solutions, and get huge speed increases, anyone up to trying?

Last edited by OpticalLiam (2011-04-28 16:29:26)

Offline

#35 2011-04-28 19:09:17

~Dannyboy~
~hoqhuue(|~
From: Melbourne, Australia
Registered: 2009-10-02
Posts: 21
Website

Re: CHALLENGE: Conway's Game of Life

I had a feeling that was going to happen! Great work guys! At least I think I provided a good baseline so you can compare how much faster it is to a well optimised loop based approach. Maybe I'll try my hand at making a surface/blend one before I go looking through your code and spoil it, I think I have a rough idea of how to approach it... Maybe...

Offline

#36 2011-04-28 21:58:30

icuurd12b42
Member
Registered: 2008-12-11
Posts: 303

Re: CHALLENGE: Conway's Game of Life

This sort of reminds me of a few videos I saw on youtube about delegating work to the video card, even work that does not seem to be image related. Open CL stuff I think

Anyway, you guys will have to explain your code.

Offline

#37 2011-04-29 00:14:05

xot
Administrator
Registered: 2007-08-18
Posts: 1,201

Re: CHALLENGE: Conway's Game of Life

OK, I'll try to explain mine here. I'll show sections of code followed by an explanation of what is going on. For clarity and simplicity, throughout this post I'll often be treating colors as it there were only one color channel (sometimes called luminosity). If I say $08, I mean $080808. Since the color channels operate independently, the extra information seems redundant in this explanation. Any operation applied to one channel, is applied to them all individually.

Surfaces used:
    surf is the state of the CA engine at the start and end of the process. It is filled with pixels that are either dead ($00) or alive ($ff).
    sum, temp, temp2 are all used to manipulate the data

   

    surface_set_target(sum);
    draw_clear_alpha(c_black,1);
    draw_set_blend_mode_ext(bm_one,bm_one);
    draw_surface_ext(surf,-1,-1, 1,1,0,$010101,1);
    draw_surface_ext(surf, 0,-1, 1,1,0,$010101,1);
    draw_surface_ext(surf, 1,-1, 1,1,0,$010101,1);
    draw_surface_ext(surf,-1, 0, 1,1,0,$010101,1);
    draw_surface_ext(surf, 1, 0, 1,1,0,$010101,1);
    draw_surface_ext(surf,-1, 1, 1,1,0,$010101,1);
    draw_surface_ext(surf, 0, 1, 1,1,0,$010101,1);
    draw_surface_ext(surf, 1, 1, 1,1,0,$010101,1);

This first part stores the sum of all neighbors for all pixels into the "sum" surface. This done by drawing "surf" offset by one pixel in each of eight directions. Additive blending is used to preserve and accumulate the input of each pass. I use color blend $010101 to ensure that the range of sums is between ($00) and ($08) as you would expect in a conventional engine.

The real key to this method is the use of low-pass and high-pass filter masks. Basically, wherever pixels are darker (low-pass) or brighter (high-pass) than a given luminosity, the mask is white ($ff), elsewhere it is black ($00). The mask can then be used with a multiplicative blend to remove (ie. make black) the pixels we do not want while preserving the ones we do. When these filter masks are used in combination (low-pass mask * high-pass mask), they form a band-pass filter. In my source code, I get low-pass and high-pass terminology backwards. Mea culpa.

I'll use some data to help illustrate the process.
Sum:
$00 $01 $02 $03 $04 $05 $06 $07 $08

    lo = $010101;
    hi = $040404;

Sets the limits of our band-pass filter. Pixels of $02 or $03 luminosity will be preserved. You should recognize these as the survival thresholds.

    // Low Pass Mask - keep everything > lo
    surface_set_target(temp);
    draw_clear_alpha(c_white,1);
    draw_set_blend_mode_ext(bm_zero,bm_inv_src_color);
    draw_surface(sum,0,0);

First thing we do is use a temp surface to store the inverted sum signal.
Temp:
$ff $fe $fd $fc $fb $fa $f9 $f8 $f7

    draw_set_blend_mode_ext(bm_one,bm_one);
    draw_primitive_begin(pr_trianglefan);
    draw_vertex_color(0,0,lo,0);
    draw_vertex_color(w,0,lo,0);
    draw_vertex_color(w,h,lo,0);
    draw_vertex_color(0,h,lo,0);
    draw_primitive_end();

Next we add to the signal to saturate/overdrive it for pixels beyond the threshold of the filter.
Temp + lo:
$ff $ff $fe $fd $fc $fb $fa $f9 $f8

    draw_set_blend_mode_ext(bm_inv_dest_color,bm_zero);
    draw_rectangle_color(0,0,w,h,c_white,c_white,c_white,c_white,false);

Reinvert the signal.
Temp:
$00 $00 $01 $02 $03 $04 $05 $06 $07

    draw_set_blend_mode_ext(bm_one,bm_one);
    repeat (8) draw_surface(temp,0,0);

Use feedback to amplify the non-zero values in the signal until they are saturated/overdriven. Here is our first filter mask.
Temp:
$00 $00 $ff $ff $ff $ff $ff $ff $ff

    // Mask Image
    draw_set_blend_mode_ext(bm_zero,bm_src_color);
    surface_set_target(sum);
    draw_surface(temp,0,0);
    surface_reset_target();

Mask the sum signal with the filter mask (temp). Everything below $02 luminosity is filtered out.
Sum:
$00 $00 $02 $03 $04 $05 $06 $07 $08

This next section is a similar filter operation. I do some unintuitive things with alpha so that the correct alpha is produced at the end of the operation. Doing it this way saves a couple of steps. At least that was the idea. Later you'll see where some steps could have been omitted, so I'm not entirely certain about this alpha business anymore. This stuff was written weeks ago, and you know how quickly one can forget the intention of one's code.

    // High Pass Mask - keep everyting < hi
    hi = $ffffff - hi;
    surface_set_target(temp);
    draw_clear_alpha(c_white,0);
    draw_set_blend_mode_ext(bm_zero,bm_src_color);
    draw_surface(sum,0,0);

First thing we do is copy the filtered sum signal into a temp surface.
Temp:
$00 $00 $02 $03 $04 $05 $06 $07 $08

    draw_set_blend_mode_ext(bm_one,bm_one);
    draw_primitive_begin(pr_trianglefan);
    draw_vertex_color(0,0,hi,0);
    draw_vertex_color(w,0,hi,0);
    draw_vertex_color(w,h,hi,0);
    draw_vertex_color(0,h,hi,0);
    draw_primitive_end();

Overdrive the portion of the signal that is beyond the filter threshold.
Temp + ($ff - hi):
$fb $fb $fd $fe $ff $ff $ff $ff $ff

    draw_set_blend_mode_ext(bm_inv_dest_color,bm_zero);
    draw_rectangle_color(0,0,w,h,c_white,c_white,c_white,c_white,false);

Invert the signal.
Temp:
$04 $04 $02 $01 $00 $00 $00 $00 $00

    draw_set_blend_mode_ext(bm_one,bm_one);
    repeat (8) draw_surface(temp,0,0);

Use feedback to amplify the non-zero values in the signal.
Temp:
$ff $ff $ff $ff $00 $00 $00 $00 $00

    // Mask Image
    draw_set_blend_mode_ext(bm_zero,bm_src_color);
    surface_set_target(sum);
    draw_surface(temp,0,0);
    surface_reset_target();

Mask the filtered sum signal with the new filter mask (temp). Everything above $03 luminosity is filtered out.
Sum:
$00 $00 $02 $03 $00 $00 $00 $00 $00

    // Survival
    surface_set_target(temp);
    draw_clear_alpha(c_black,1);
    draw_set_blend_mode_ext(bm_one,bm_one);
    draw_surface(sum,0,0);
    repeat (8) draw_surface(temp,0,0);

Feedback the sum signal to create a band-pass filter mask stored in temp surface. It's possible masking the sum signal and doing the feedback loop could be skipped by simply masking the first mask with the second mask. That might require an additional surface.
Temp:
$00 $00 $ff $ff $00 $00 $00 $00 $00

    draw_set_blend_mode_ext(bm_dest_color,bm_zero);
    draw_surface(surf,0,0);
    surface_reset_target();

Filter the original input signal (surf) with the band-pass filter mask, storing it into temp. Pixels existing in the input signal will survive where their neighborhood sum is $02 or $03. All other pixels will become black.

   // Birth
    lo = $020202;
    // Low Pass Mask - keep everything > lo
    surface_set_target(temp2);
    draw_clear_alpha(c_white,1);
    draw_set_blend_mode_ext(bm_zero,bm_inv_src_color);
    draw_surface(sum,0,0);
    draw_set_blend_mode_ext(bm_one,bm_one);
    draw_primitive_begin(pr_trianglefan);
    draw_vertex_color(0,0,lo,0);
    draw_vertex_color(w,0,lo,0);
    draw_vertex_color(w,h,lo,0);
    draw_vertex_color(0,h,lo,0);
    draw_primitive_end();
    draw_set_blend_mode_ext(bm_inv_dest_color,bm_zero);
    draw_rectangle_color(0,0,w,h,c_white,c_white,c_white,c_white,false);
    draw_set_blend_mode_ext(bm_one,bm_one);
    repeat (8) draw_surface(temp2,0,0);

Here is another mask operation just like the first one, except we want to preserve pixels where the sum signal is $03 or more. Since the sum signal has already been filtered to remove anything above $03, all that will remain is a mask where the sum signal was exactly $03. You should recognize this as the birth condition.

    // Mask Image
    draw_set_blend_mode_ext(bm_zero,bm_src_color);
    surface_set_target(sum);
    draw_surface(temp2,0,0);
    surface_reset_target();

This masks the sum signal so that only pixles of $03 luminosity remain. This is a completely unnessary step and is probably a remnant of testing.

    surface_set_target(surf);
    draw_set_blend_mode_ext(bm_inv_dest_color,bm_inv_src_color);
    draw_primitive_begin(pr_trianglefan);
    draw_vertex_color(0,0,$ffffff,0);
    draw_vertex_color(w,0,$ffffff,0);
    draw_vertex_color(w,h,$ffffff,0);
    draw_vertex_color(0,h,$ffffff,0);
    draw_primitive_end();
    draw_set_blend_mode_ext(bm_zero,bm_src_color);
    draw_surface(temp2,0,0);
    draw_set_blend_mode_ext(bm_one,bm_one);
    draw_surface(temp,0,0);
    surface_reset_target();

Inverts the input signal (surf) so dead pixels are now white ($ff), then masks this with the band-pass filter mask created in the previous section (temp2). Wherever a dead pixel has $03 neighbors remains white, everywhere else is black. These pixels are merged (added) with the survival mask (temp) to form the new state (surf). It seems like this last section could be simplified by simply applying the birth mask (temp2) directly to the survival mask (temp) and skipping the manipulation of the input signal (surf).

It appears there are a few places where things could be omitted or combined for marginally better performance.


Abusing forum power since 1986.

Offline

#38 2011-04-29 05:12:54

~Dannyboy~
~hoqhuue(|~
From: Melbourne, Australia
Registered: 2009-10-02
Posts: 21
Website

Re: CHALLENGE: Conway's Game of Life

Wow! I think I would have got somewhere near there eventually but it's a lot more work than I expected (and I quickly gave up because I'm lazy tongue). I would have done a few parts differently.

1) Creating the neighbour surface: you're current solution uses 1 clear and 8 draws, my proposed solution uses 2 clears and 6 draws (which is faster in my mind...). It works by summing the surfaces horizontally first than summing the result vertically. Note that the sum will now contain the centre cell also, but that's not an issue if you account for it when picking your upper and lower bounds for living cells.

// xot's code
surface_set_target(sum);
draw_clear_alpha(c_black,1);
draw_set_blend_mode_ext(bm_one,bm_one);
draw_surface_ext(surf,-1,-1, 1,1,0,$010101,1);
draw_surface_ext(surf, 0,-1, 1,1,0,$010101,1);
draw_surface_ext(surf, 1,-1, 1,1,0,$010101,1);
draw_surface_ext(surf,-1, 0, 1,1,0,$010101,1);
draw_surface_ext(surf, 1, 0, 1,1,0,$010101,1);
draw_surface_ext(surf,-1, 1, 1,1,0,$010101,1);
draw_surface_ext(surf, 0, 1, 1,1,0,$010101,1);
draw_surface_ext(surf, 1, 1, 1,1,0,$010101,1);

// Suggested optimisation, maybe...
surface_set_target(sum_horizontal);
draw_clear_alpha(c_black,1);
draw_set_blend_mode_ext(bm_one,bm_one);
draw_surface_ext(surf,-1,0,1,1,0,$010101,1);
draw_surface_ext(surf, 0,0,1,1,0,$010101,1);
draw_surface_ext(surf, 1,0,1,1,0,$010101,1);
surface_set_target(sum);
draw_clear_alpha(c_black,1);
draw_surface(sum_horizontal,0,-1);
draw_surface(sum_horizontal,0, 0);
draw_surface(sum_horizontal,0, 1);

2) Filtering: Using the "removeback" feature of "background_create_from_surface" may be faster than your feedback method, I'm unsure whether the resource allocation will cost more or less than the drawing saved. I'm thinking something like:

  • use background_create_from_surface making all white into transparent

  • draw the background to a white surface with draw_background_ext(back, 0, 0, 1, 1, 0, c_black, 1)

  • now everything which wasn't white is black. I think (?)

3) I'm a little confused, dare I ask...

// Why this...
draw_primitive_begin(pr_trianglefan);
draw_vertex_color(0,0,lo,0);
draw_vertex_color(w,0,lo,0);
draw_vertex_color(w,h,lo,0);
draw_vertex_color(0,h,lo,0);
draw_primitive_end();

// ... instead of this?
draw_set_color(lo);
draw_set_alpha(0);
draw_rectangle(0, 0, w, h, false);

Seems like 3 extra function calls to me, and surely it's not faster (?), or more obvious as to what it's doing (?), or does something different (?).

Offline

#39 2011-04-29 11:11:19

xot
Administrator
Registered: 2007-08-18
Posts: 1,201

Re: CHALLENGE: Conway's Game of Life

1) That's an interesting way to do it. I've use similar methods to speed up strictly cosmetic blur filters. Also don't forget that changing render targets takes time too.

2) I'm pretty sure any of the functions that create a sprite or background from the screen or a surface are much, much slower than a few surface draws. I'd love to be wrong on this point.

3) I did it that way out of habit. When writing general purpose scripts, I try not tinker with the user's draw and alpha settings unless I have to. It's also a matter of control. I'm a bit prejudiced against the draw_rectangle function because of its history of inconsistency. I should probably give up my bigoted ways. All that said, I did make that change for an optimized version I was working on and it is faster. I ran into some problems elsewhere and abandoned the optimizations. I forgot I'd even done it until I read your post.

The process appears more complicated than it really is conceptually. I started this with low-pass, high-pass, and band-pass filters already constructed for other purposes. Having those tools available made for a very simple block diagram illustrating the process. Normally these filters would be in scripts, but to speed up the execution I copied them inline and cut out some redundancy and waste.


Abusing forum power since 1986.

Offline

#40 2011-04-29 15:58:33

OpticalLiam
Member
Registered: 2007-10-11
Posts: 8

Re: CHALLENGE: Conway's Game of Life

The code is seemingly convoluted because GM is so restrictive when it comes to image manipulation, and we have to use hacks to achieve simple filters such as thresholding.

Xot provided great commentary on his code, but I will give a shorter more abstract summary for those who are still puzzled. The method is actually quite simple, but clever - there are three key processes.

First is the neighbour accumulation, where the sums of the number of neighbours are calculated on a surface by performing a 2 x 2 box-blur. In GM we do this by additively drawing the surface 9 times at offsets [(-1,-1), (0,-1), (1,-1) ...].

Then for each of the rules of the game of life, we use threshold filters to remove pixels that are lower than a certain intensity.

In GM we do threshold filtering by first drawing a surface (here it's the neighbour accumulation surface) with a certain gray-value color  - the default is c_white, but we instead use some lower gray-value, which has the effect of adjusting the intensity of all pixels in preparation for the next stage. We then contrast-ramp the surface up by drawing the surface to its self many times (for our purposes we need 8 passes) with a special contrast blend mode, which has the effect of intensifying the pixels we want to keep (making them whiter) and reducing the ones we don't (making them blacker). So, the value of gray used before ramping essentially sets the threshold level that the ramping stage works on (and might tricky to choose, I just used experimentation).

In my implementation, each rule has its own surface, to store the result of the rule and requires other temporary surfaces (reusing others) to generate them.

The under-population rule just requires a single threshold pass and will be additive in the final stage.
The over-population rule requires a single threshold pass, and a mask, but will be subtractive.
The birth rule is a bit trickier, and requires two thresholds (lower and upper) merged together, and naturally the result is also additive.

Then the final stage is to correctly merge each rule together, which is adding and subtracting the above rule surfaces as required. This is done on the main surface, which is then fed as the seed in the next step.

Hope that clears it up a bit more!

Last edited by OpticalLiam (2011-04-29 16:18:51)

Offline

Board footer

Powered by FluxBB