GMLscripts.com

Discuss and collaborate on GML scripts
Invert

You are not logged in.

#1 2010-09-21 16:54:27

kalzme
Member
Registered: 2010-02-06
Posts: 7

CHALLENGE: Lossless Data Compression / Decompression

The goal would be to create 1 or more scripts in Game Maker capable of compressing data and afterwards being able te decompress it to the exact same data, meaning you will have to create a lossless compression algorithm. You will also have to keep in mind that speed is an important factor so the key to winning this challenge is finding a good balance between speed and compression ratio.

I will split this challenge into 2 rounds because I want the participants to be able to learn from each other while they can still change their solution. People that want to participate have to let me know by replying BEFORE the 27th of September because that's when ROUND 1 starts.

For ROUND 1 you will get 3 weeks (27 Sept - 18 Oct) but you can send your solution whenever you are ready.

After solutions have been sent I want the participants to discuss the solutions (how can the solutions be improved, why is a certain solution slow, etc.) This should be an interesting way for the participants to get new ideas and to learn from the others.

I WILL judge the solutions of ROUND 1(based on speed of compression, speed of decompression, compression rate and reliability)
BUT I WILL NOT show you your scores yet.

After ROUND 1 has ended (18 Oct), ROUND 2 will start IMMEDIATELY and you will get 1 week (18 Oct - 25 Oct) to improve your solution.

After that I will judge your second version and then I will anounce your total scores (ROUND 1: 40% | ROUND 2: 60%)

NOTE:
Be sure that your compression doesn't only compress good when it comes to plain text, because it will also be tested on files that don't contain plain text.

I wish you all the best of luck!

IF YOU FIND IT HARD/IMPOSSIBLE TO CREATE THIS WITHIN THIS TIMESPAN, PLEASE LET ME KNOW, I CAN STILL CHANGE THE STARTING/ENDING DATES.

Spoiler

I will not participate myself but I have already created a solution (even though it's not optimal yet).
I will probably post it after ROUND 1 so my solution can also be used/discussed/...

Last edited by kalzme (2010-09-21 17:02:05)

Offline

#2 2010-09-21 18:05:14

xot
Administrator
Registered: 2007-08-18
Posts: 1,240

Re: CHALLENGE: Lossless Data Compression / Decompression

I have a couple of questions:

The idea is to create scripts that can compress and restore arbitrary files (as opposed to some other data source)?

Do we post our solutions in-topic, or are we supposed to send them to you privately?


Abusing forum power since 1986.

Offline

#3 2010-09-21 18:16:32

kalzme
Member
Registered: 2010-02-06
Posts: 7

Re: CHALLENGE: Lossless Data Compression / Decompression

It should not be able to compress/decompress every file it is offered.
If you choose to create something that can only compress plain text then that's ok, but it will affect your score of course.

Well, my intension was that everyone could check out your file, learn from it, tell you what they would change, etc.
but if you don't want to publish the source, you can send it to me privately and post an executable here so others can see how you did.

Offline

#4 2010-09-21 20:09:10

jalb
Member
From: Texas
Registered: 2010-09-14
Posts: 12
Website

Re: CHALLENGE: Lossless Data Compression / Decompression

Sounds good, I've already got some ideas for this, so I'm in!

Offline

#5 2010-09-21 21:56:46

~Dannyboy~
~hoqhuue(|~
From: Melbourne, Australia
Registered: 2009-10-02
Posts: 21
Website

Re: CHALLENGE: Lossless Data Compression / Decompression

I'll give it a shot!

Last edited by ~Dannyboy~ (2010-09-21 21:57:04)

Offline

#6 2010-09-22 03:14:12

xot
Administrator
Registered: 2007-08-18
Posts: 1,240

Re: CHALLENGE: Lossless Data Compression / Decompression

I'm in.

By the way, if anyone has any future GMLscripts.com Challenge ideas they want to kick off, contact me via PM here or at GMC.


Abusing forum power since 1986.

Offline

#7 2010-09-23 08:10:08

jalb
Member
From: Texas
Registered: 2010-09-14
Posts: 12
Website

Re: CHALLENGE: Lossless Data Compression / Decompression

By the way, is all judging going to be done in one version of GM?  I'd suspect so since GM8 is faster than GM7 (by my tests), so GM8 makes sense.

Also, what exactly would be "good" compression time?  For a 4MB file, GM took about 18 seconds just to read it byte for byte with the file_bin functions... (will I get disqualified if I turn in a DLL instead? tongue)

Offline

#8 2010-09-23 08:29:07

kalzme
Member
Registered: 2010-02-06
Posts: 7

Re: CHALLENGE: Lossless Data Compression / Decompression

jalb wrote:

By the way, is all judging going to be done in one version of GM?  I'd suspect so since GM8 is faster than GM7 (by my tests), so GM8 makes sense.

I only have a registered copy of GM7 so judging will be done with GM7.
But since every file will be converted to a gmk of GM7, it wouldn't be unfair.

jalb wrote:

Also, what exactly would be "good" compression time?  For a 4MB file, GM took about 18 seconds just to read it byte for byte with the file_bin functions... (will I get disqualified if I turn in a DLL instead? tongue)

I know GM is slow with big files, long strings and big data structures, so I will not test with files that are beyond GM's capabilities.
I have a couple of files from the Caligary Corpus that I will probably use. The biggest file would be 768.771 bytes if I use those.
That should be doable AND it should give a good indication of how good your compression ratio and speed is.

You could split the data into chunks and perform your compression on every chunk.
This would lead to a worse compression ratio but it speeds things up.

BTW, my solution can't even compress a 50kB file, it's just too slow. So I try to solve that by splitting the string up, sacrificing compression ratio for speed.

Offline

#9 2010-09-23 09:42:50

xot
Administrator
Registered: 2007-08-18
Posts: 1,240

Re: CHALLENGE: Lossless Data Compression / Decompression

I've got an extremely fast and effective solution (very comparable to ZIP *big hint!*), but decoding 64KB of data is currently taking too long for my tastes (about 11 seconds). It also won't work in GM7 (*bigger hint!*), so this may have to be an exhibition-only entry. It also has some semi-unavoidable overhead making it unsuited for files smaller than a hundred bytes (*biggest hint!*), although that can be worked around at the expense of speed.

I may be able to adapt the method to GM7, but I haven't tried to yet. I specifically registered GM8 to do this, so you are in big, big trouble, kalzme. tongue

I think you should provide us with some sample files so that we can compare our results while we tweak our entries.


Abusing forum power since 1986.

Offline

#10 2010-09-23 10:25:13

kalzme
Member
Registered: 2010-02-06
Posts: 7

Re: CHALLENGE: Lossless Data Compression / Decompression

I was hoping that this wouldn't happen.

Well you can still post the GM8 and an executable so people can see how you did it, and I can test it with the executable (use get_open_filename then so I can use the files I have for the judging)

Problem solved?!

Offline

#11 2010-09-23 10:42:27

xot
Administrator
Registered: 2007-08-18
Posts: 1,240

Re: CHALLENGE: Lossless Data Compression / Decompression

Fair enough.


Abusing forum power since 1986.

Offline

#12 2010-09-23 20:17:01

~Dannyboy~
~hoqhuue(|~
From: Melbourne, Australia
Registered: 2009-10-02
Posts: 21
Website

Re: CHALLENGE: Lossless Data Compression / Decompression

I've got an extremely fast and effective solution (very comparable to ZIP *big hint!*), but decoding 64KB of data is currently taking too long for my tastes (about 11 seconds). It also won't work in GM7 (*bigger hint!*), so this may have to be an exhibition-only entry. It also has some semi-unavoidable overhead making it unsuited for files smaller than a hundred bytes (*biggest hint!*), although that can be worked around at the expense of speed.

That sounds an awful lot like my first (somewhat cheesy) idea... does you're solution use surface_save by any chance?

Because I thought that would be too easy, I think I'm going to have a shot at implementing Huffman coding in GML. I doubt my implementation will come close to the efficiency of DEFLATE (the compression used in PNG and gzip).

Offline

#13 2010-09-24 00:58:06

icuurd12b42
Member
Registered: 2008-12-11
Posts: 303

Re: CHALLENGE: Lossless Data Compression / Decompression

ah you sneaky devils... too bad gm has such a poor/slow method for grabbing the info.

apart from rle, all the other methods go as far over my head as a 747 in trans-atlantic flight.


A long time ago, I imagined a method where a huge database of symbols could be made (instead of dynamically generating them), like a gegabyte size database, holding to most commonly found character sequences of various sizes in files, where the largest entry would be say 256 chars and the smallest would be 8. Everyone would have this database and a compressed file would simply reference the entries from the database... using 3 bytes.

a segment would consist of a byte, 0 to 255, indicating the number of terms referenced, or a number of bytes (if 1st reference is 0), for strings not found in the database.
followed by a matching number of 3 byte references, or <000> + the actual uncompressed string.

the 3 bytes would allow for 16777215-1 entries in the datase (-1 because 0 indicates a string)

in this example, this string would look like that in the file, where as ThisSequenceHereIsNotInTheDatabase

<5><002><001><003><001><004> (1+15)
<1><000>, (1+3+1)
<17><001><003><001><005><001><006><001><007><001><008><001><009><001><002><001><00A><001><00B><001><00c><001><00D><001> (1+3*23)
<22><000>ThisSequenceHereIsNotInTheDatabase (1+3+34)

I split into multiple lines but it's sequential
each character in the <> are in hex, <> are just to separate visualy her, in () shows the number of bytes

database
1 <space>
2 in
3 this
4 example
5 string
6 would
7 look
8 like
9 that
A the
B file
C where
D as


Of course, the provided example yields a bigger file than the original. the database would need to hold entries larger in character lenght than the words I used for readability. Like multiple words...

To create the database, one would have to analise the content of a lot of PCs to figure out the most common entries

Last edited by icuurd12b42 (2010-09-24 01:03:30)

Offline

#14 2010-09-24 03:25:02

xot
Administrator
Registered: 2007-08-18
Posts: 1,240

Re: CHALLENGE: Lossless Data Compression / Decompression

~Dannnyboy~ hit the nail on the head, and yeah, it's cheesy, but I'm not above cheesiness. Like icuurd12b42 says, the big problem is getting the data back out using the pixel reading functions which are soooo slooow. I'm looking into implementing INFLATE with GML to see if it will be any faster.

I'm kind of in the same boat as icuurd12b42. I've implemented RLE in the past, but got lost when I tried to make a GIF decoder (LZW). The method icuurd12b42 is describing is actually very similar to LZ-style compression, except that the table in those are limited to just the data that are in the original file.

The RAR archiver, unlike ZIP, can using a single table for all files in the archive (called a solid archive) which usually yields greater compression and is similar to icuurd12b42's global table idea. On a per-game basis, generating a master table could be effective, but could also be slow to regenerate after any changes to the game.


Abusing forum power since 1986.

Offline

#15 2010-09-27 11:15:33

jalb
Member
From: Texas
Registered: 2010-09-14
Posts: 12
Website

Re: CHALLENGE: Lossless Data Compression / Decompression

Actually, count me out.  I might stick around but won't be submitting anything myself.

Really don't have the patience for this one.

Offline

#16 2010-10-16 14:24:47

kalzme
Member
Registered: 2010-02-06
Posts: 7

Re: CHALLENGE: Lossless Data Compression / Decompression

since the end of round 1 is near i thought i'd post my 'solution' already.
has anyone else finished yet or did we all forget about the challenge? xD


So here it is: Compression
it uses these techniques:

it's slow and it still has some flaws in it but i haven't really had the time to finish it.
ask questions, give comments, ...do whatever you like wink

Offline

#17 2010-10-19 07:13:07

~Dannyboy~
~hoqhuue(|~
From: Melbourne, Australia
Registered: 2009-10-02
Posts: 21
Website

Re: CHALLENGE: Lossless Data Compression / Decompression

Oops! Sorry! I caught the Minecraft bug... So addicting, so much time wasted, so many diamonds to be found!

Offline

#18 2010-10-19 07:15:49

kalzme
Member
Registered: 2010-02-06
Posts: 7

Re: CHALLENGE: Lossless Data Compression / Decompression

haha biggrin
no problem wink
how about we make this challenge permanent?
like whenever people want they can send something and i'll make up a scoreboard or something and you can update your entry anytime you want?

Offline

#19 2010-10-30 13:23:33

xot
Administrator
Registered: 2007-08-18
Posts: 1,240

Re: CHALLENGE: Lossless Data Compression / Decompression

I think that's probably a good idea. This is a deep subject.

I haven't had time to work on it beyond the first day. I'll follow up with what I've managed. I have not yet been able to implement DEFLATE/INFLATE in GML. Man, GMC is killing my productivity right now.


Abusing forum power since 1986.

Offline

#20 2010-10-30 13:35:00

kalzme
Member
Registered: 2010-02-06
Posts: 7

Re: CHALLENGE: Lossless Data Compression / Decompression

alright then
i'm looking forward to seeing your implementation

Offline

Board footer

Powered by FluxBB