Classes vs structs

Editor
May 26, 2013 at 12:00 AM
Edited May 26, 2013 at 12:02 AM
Why are we using structs? In the other thread, Remfin said that best practice is to use classes, and I'm running into some roadblocks as a result of these being structs.

For instance, we want GwEvent to expose a GwWorld instead of a WorldId and a GwMap instead of a MapId, plus we have to resolve the event ID itself to a name. If this is a class, then we can do this:
        var response = ApiCall.GetContent<Dictionary<string, List<GwEvent>>>("events.json", arguments, ApiCall.Categories.World);

        foreach (GwEvent gwEvent in response["events"])
            gwEvent.ResolveIDs(this.wm);

        return response["events"];
In the GwEvent class, we have this:
    internal void ResolveIDs(WorldManager wm)
    {
        this.Map = wm.Maps[this.MapId];
        this.World = wm.Worlds[this.WorldId];
        this.Name = wm.Events.EventNames[this.EventId];
    }
So with that simple call, we're done! The class resolves all its own IDs automatically. If the class members change as the API changes, all we have to update is the implementation in the class itself. Nothing else has to change.

With structs, this doesn't work, because ResolveIDs() creates brand new objects by changing the values. This forces us to keep the implementation outside of the GwEvent object.

On top of that, we're unnecessarily creating lots of duplicate objects. The current code looks like this:
        var response = ApiCall.GetContent<Dictionary<string, List<GwEvent>>>("events.json", arguments, ApiCall.Categories.World);

        // Use Linq to match the event ids from the response with the nameResponse 
        // and put them in a list then return them.
        return (from eventsList in response.Values
                from events in eventsList
                from eventName in this.EventNames
                where events.EventId == eventName.Key
                select new GwEvent(events.WorldId, events.MapId, events.EventId, events.State, eventName.Value)).ToList();
So we get a list back from the API call, which is something like 1700 GwEvent objects. But then we create a brand new list of 1700 objects that has the names, instead of just populating the names on the existing objects. We have to do this, because they're structs instead of classes.

It's not the end of the world, but it's not exactly efficient either. What are we gaining by using structs?
Editor
May 26, 2013 at 12:14 AM
Keep in mind that when you retrieve all events for all worlds, you get 77,955 objects. So depending on what the caller is doing, this unnecessary duplication could become nontrivial.
Editor
May 26, 2013 at 4:15 AM
GwEvent now exposes World and Map objects to the user instead of WorldId and MapId. This work is done in EventData.cs. We're still duplicating the objects when this work is done, but we do the name at the same time, so it's no worse than it was before.
Editor
May 27, 2013 at 1:50 AM
So I did some memory usage analysis in both scenarios. In the first case I had it using structs as the code is currently checked in. In the second instance I had it using classes, and restructured the code like my example above where we call the object to resolve it's own IDs and don't create the duplicate objects.

Memory usage actually stays a lower when I use structs, but the code executes faster when I use classes. I ran a test of retrieving all events 5 times. With the memory analysis running, this took 360 seconds with structs versus 320 seconds with classes. However, with classes, Private Bytes peaked at 157 MB, whereas with structs, it peaked at only 107 MB. So I guess either way we go we're sacrificing something.
Coordinator
May 27, 2013 at 10:37 AM
Could you do the test again with mock data, so we can exclude delays from the ANet server? Just take the Json string and hard-code it into the GetJsonString method in the ApiCall class. At best you do this with a list of events that are not filtered with a server_id.
Editor
May 27, 2013 at 6:39 PM
Alright, here's what I did.

I grabbed the full events.json response containing 80,000 rows representing all events on all servers, and I saved that into the solution. I added a new class called GwEventClass that contained the class-based implementation so that I don't have to keep switching the code back and forth. Then I added two new methods to EventData.cs:
    /// <summary>
    /// Test code. Do not use.
    /// </summary>
    /// <returns></returns>
    public IEnumerable<GwEvent> GetStructEventsFromFile()
    {
        var reader = new System.IO.StreamReader(
            System.IO.Path.GetDirectoryName(
            System.Reflection.Assembly.GetAssembly(this.GetType()).Location) + "/events.json");
        var eventsResponse = Newtonsoft.Json.JsonConvert.DeserializeObject<Dictionary<string, List<GwEvent>>>(reader.ReadToEnd());
        reader.Close();

        // Turn the API events into events with names and return them
        return (from variable in eventsResponse.Values
                from apiEvent in variable
                select this.GetResolvedEvent(apiEvent)).ToList();
    }

    /// <summary>
    /// Test code. Do not use.
    /// </summary>
    /// <returns></returns>
    public IEnumerable<GwEventClass> GetClassEventsFromFile()
    {
        var reader = new System.IO.StreamReader(
            System.IO.Path.GetDirectoryName(
            System.Reflection.Assembly.GetAssembly(this.GetType()).Location) + "/events.json");
        var eventsResponse = Newtonsoft.Json.JsonConvert.DeserializeObject<Dictionary<string, List<GwEventClass>>>(reader.ReadToEnd());
        reader.Close();

        foreach (var gwEventClass in eventsResponse["events"])
            gwEventClass.ResolveIDs(this.wm);

        return eventsResponse["events"];
    }
Then, I wrote a simple test app to call this 10 times:
    static void Main(string[] args)
    {
        GW2DotNET.V1.World.WorldManager wm = new GW2DotNET.V1.World.WorldManager();

        DateTime startTime = DateTime.Now;

        for (int x = 0; x < 10; x++)
        {
            wm.Events.GetStructEventsFromFile();
        }

        DateTime endTime = DateTime.Now;

        Console.WriteLine("Time: " + (endTime - startTime).TotalSeconds + " seconds.");
    }
I compiled it using "Any CPU" and "Release" build.

I ran it 5 times using GetClassEventsFromFile(), and then I ran it 5 times using GetStructEventsFromFile(). Results:

GetClassEventsFromFile:
Run 1: 10.07 seconds
Run 2: 9.89 seconds
Run 3: 9.92 seconds
Run 4: 9.87 seconds
Run 5: 9.95 seconds
Average: 9.94 seconds

GetStructEventsFromFile:
Run 1: 10.75 seconds
Run 2: 10.66 seconds
Run 3: 10.61 seconds
Run 4: 10.49 seconds
Run 5: 10.61 seconds
Average: 10.62 seconds

So by using a class where we can simply populate the missing values instead of instantiating a whole new set of objects, we shave 6.5% off the time it takes to do this. This difference becomes much more pronounced with Debug builds, but hopefully no one is using those in production. The file is stored on an SSD in my case, so disk latency should be minimal. I suppose I could hard-code the string into the code itself, but, meh.

Running the memory profiler now...
Coordinator
May 27, 2013 at 8:45 PM
The bigger difference in Debug builds has something to do "JIT-ing" (look here for another example). but yeah, users should use the release build and if we release binaries we ahve to make sure we change the settings from debug to release.
Editor
May 27, 2013 at 9:34 PM
With the release build, I'm not seeing any significant differences in memory usage. I guess that had something to do with the Debug build.

So it appears that from a performance perspective, there is no downside to making GwEvent a class.

However, if we do switch it to a class we should probably also switch GwMap and GwWorld, not to mention anything else we return, just for consistency of behavior. Otherwise, we can leave GwEvent as a struct and just pay the performance cost.

The other place a class is going to be useful is the match_details response. That's another big, complicated object that's going to need multiple fields populated after deserialization. I've only just started playing with item_details, but it looks like that might be another one.
Coordinator
May 27, 2013 at 10:10 PM
item details and match details is a huge pain in the ass. If you look at the shelvesets (fyi: If you shelve code it is "comitted" but not integrated, meaning you can upload the code to the server but the changes are not publicly available, will do a tutorial later) under Team Explorer -> Pending Changes -> Actions -> Find Shelvesets you'll find a basic implementation of the WvW model and just now it is horrible. Well I did throw it together in about two hours, nonetheless, horrible!

I'll do a more throughout research if we should use classes or structs, for now keep the structs. But change the return types to IEnumerable<T>
Coordinator
May 28, 2013 at 10:36 AM
Edited Jun 5, 2013 at 9:59 PM
Also bear in mind that we can change the Ids directly in the struct. We just have to define the appropriate method.


Taking your code and changing it slightly:
var response = ApiCall.GetContent<Dictionary<string, List<GwEvent>>>("events.json", arguments, ApiCall.Categories.World);

List<GwEvent> changedEvents = new List<GwEvent>(); 

foreach (GwEvent gwEvent in response["events"])
    changedEvetns.Add(gwEvent.ResolveIDs(this.wm));

return changedEvents;
In the GwEvent class, we then have this:
internal GwEvent ResolveIDs(WorldManager wm)
{
    GwEvent newEvent = this;    
    
    newEvent.Map = wm.Maps[this.MapId];
    newEvent.World = wm.Worlds[this.WorldId];
    newEvent.Name = wm.Events.EventNames[this.EventId];
}
Yes we have to create a new list and add the changed values to that, however I'm pretty sure there is a neater solution. I only typed this from memory in the codeplex site, not in VS, but you should get the gist.
Coordinator
May 28, 2013 at 8:48 PM
Ok. I implemented the change and everything runs fine so far. Please look over it bilong and look if you find any obvious errors.

I deleted the get singleEvent on purpose because it is too much hassle and as soon as you want to get more than two or three events getting the whole bunch and then filtering via linq is almost as fast, if not faster.