Advent of Code 2020, Day 4 -- Passport Processing

Dec 4th, 2020

Advent Of Code

Python solutions to Advent of Code 2020, day 4.

My Advent of Code solutions, done in Python. Refactors and solutions in other languages will be added if/when they're done.

Full code for Day 4 can be found on GitHub.

You can participate in Advent of Code at adventofcode.com/2020.

The Day 4 problem can be found at adventofcode.com/2020/day/4.

Solution, part 1

Oof, starting to get a little bit more difficult today. First, what I want to do is take the jumble of lines and data chunks and turn each passport into a nice dictionary object. I want to end up with a list of dictionaries so I can iterate over the passports and see if each one has the right data.

You'll see if len(split_data) < 2: in there. That was my way of testing for a blank line between passports. A blank line has no ":" in it, so splitting it produces a list with only one string instead of two. A key-value pair splits into [key, value], and that's used to fill in the passport object below the if statement. In the case of a blank line, I'm appending the passport object to the arr list and replacing passport with an empty dictionary object. Not the cleanest way of doing it, but it works.

You'll also see my comment to myself # Don't forget the final passport! because I totally forgot to append the final passport! I was ending up with a value that was one less because the last passport wasn't appended since there was no newline after it!
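
To see why the length check catches blank lines, here's a small standalone illustration. The sample strings are made up for the example and aren't from my puzzle input.

```python
# Hypothetical sample lines: one with data, one blank (already stripped).
data_line = "ecl:gry pid:860033327"
blank_line = ""

# Each chunk of a data line splits into a [key, value] pair.
pairs = [chunk.split(":") for chunk in data_line.split(" ")]
print(pairs)  # [['ecl', 'gry'], ['pid', '860033327']]

# A blank line splits into a single empty string, so the inner list has length 1.
blank_pieces = [chunk.split(":") for chunk in blank_line.split(" ")]
print(blank_pieces)  # [['']]
```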

# The usual text file parsing.
# Using "with" closes the file once the lines are read.
with open("input.txt", "r") as puzzle_input:
    lines = [line.strip() for line in puzzle_input]

# Split the text file data chunks, split those by the delimiter ":",
# then put those into the dictionary and push to a list when a
# newline is found.
def passport_processor(passports):
    arr = []
    passport = {}
    for line in passports:
        split_line = line.split(" ")
        for l in split_line:
            split_data = l.split(":")
            if len(split_data) < 2:
                arr.append(passport)
                passport = {}
                break
            passport[split_data[0]] = split_data[1]
    arr.append(passport)  # Don't forget the final passport!
    return arr
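
For comparison, here's a rough sketch of a different way to get the same list of dictionaries: split the whole text on blank lines first, then split each block on whitespace. This is not what my function above does. The names parse_passports and sample_text are made up for the example, and it assumes the file is read as one string with no trailing blank lines.

```python
# Hypothetical sample input: two passports separated by a blank line.
sample_text = """ecl:gry pid:860033327 eyr:2020
byr:1937 iyr:2017

iyr:2013 ecl:amb
hcl:#cfa07d"""


def parse_passports(text):
    passports = []
    # A blank line shows up as two newline characters in a row.
    for block in text.split("\n\n"):
        # split() with no arguments splits on spaces and newlines alike.
        fields = block.split()
        # Each "key:value" field becomes one dictionary entry.
        passports.append(dict(field.split(":") for field in fields))
    return passports


parsed = parse_passports(sample_text)
print(len(parsed))  # 2
```

Because the final block is handled like any other, there's no last passport to forget.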

Now that there's a nice data structure to work with, I can go about checking the passports for the required data. In this case, I simply made a bunch of if statements that respond with "fail" if the key isn't in the passport dictionary. You'll notice that cid is at the top and will always respond with "pass", but the rest of the keys will fail if any are missing.

# Check individual passports for pass/fail based on dict keys.
def passport_checker(passport):
    pass_check = "pass"
    if 'cid' not in passport:
        pass_check = "pass"
    if 'byr' not in passport:
        pass_check = "fail"
    if 'iyr' not in passport:
        pass_check = "fail"
    if 'eyr' not in passport:
        pass_check = "fail"
    if 'hgt' not in passport:
        pass_check = "fail"
    if 'hcl' not in passport:
        pass_check = "fail"
    if 'ecl' not in passport:
        pass_check = "fail"
    if 'pid' not in passport:
        pass_check = "fail"
    return pass_check
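
The same key check could also be written as a set comparison. This is only a sketch of that idea, with REQUIRED_FIELDS and has_required_fields as hypothetical names, and it returns True/False rather than the "pass"/"fail" strings I'm using.

```python
# Hypothetical names for this sketch; cid is left out because it is optional.
REQUIRED_FIELDS = {"byr", "iyr", "eyr", "hgt", "hcl", "ecl", "pid"}


def has_required_fields(passport):
    # issubset accepts any iterable, and iterating a dict yields its keys.
    return REQUIRED_FIELDS.issubset(passport)


complete = {"byr": "1937", "iyr": "2017", "eyr": "2020", "hgt": "183cm",
            "hcl": "#fffffd", "ecl": "gry", "pid": "860033327"}
missing_hgt = {key: value for key, value in complete.items() if key != "hgt"}

print(has_required_fields(complete))     # True
print(has_required_fields(missing_hgt))  # False
```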

Since passport_checker tests each passport individually, I need one more function to map over the passports in the list provided by passport_processor and get the number of passing passports.

I'm using a list comprehension here that runs passport_checker over the list returned by passport_processor and counts the number of times that "pass" appears in the list.

# Run through all passports in the file
# and report back the number of "pass" passports.
def check_all_passports(passports):
    checked_passports = [passport_checker(p) for p in passport_processor(passports)]
    return checked_passports.count("pass")


print(check_all_passports(lines))

Solution, part 2

I'm going to mostly leave passport_checker alone and build a new function to do the further validation. What I do need to do, though, is pass the passport data along with the pass/fail determination, so I can take the passports that pass the first check and run the validation on them.

To do this, I'm going to return a tuple from passport_checker instead of a string. The tuple will hold the pass/fail string and the passport data.

# Check individual passports for pass/fail based on dict keys.
def passport_checker(passport):
    pass_check = "pass"
    if 'cid' not in passport:
        pass_check = "pass"
    if 'byr' not in passport:
        pass_check = "fail"
    if 'iyr' not in passport:
        pass_check = "fail"
    if 'eyr' not in passport:
        pass_check = "fail"
    if 'hgt' not in passport:
        pass_check = "fail"
    if 'hcl' not in passport:
        pass_check = "fail"
    if 'ecl' not in passport:
        pass_check = "fail"
    if 'pid' not in passport:
        pass_check = "fail"
    return pass_check, passport  # Added the tuple here

Next, I need to write all of the validation checks for each value. I decided to use if statements again, looking for whether the value does not fit the parameters. If it doesn't, it immediately returns a fail message instead of moving on to the next validation check.

For the three values that are numerical (byr, iyr, and eyr), I first did checks to make sure that they are decimal numbers, since trying to create an integer out of a string with a non-numerical character would fail. Then, I did the checks for whether or not the numbers fell within the specified ranges.
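
As a quick illustration of that two-step check, here's a sketch of a helper that does both steps at once. The name year_in_range is hypothetical; my actual code below keeps the two checks as separate if statements.

```python
def year_in_range(value, low, high):
    # isdecimal() is checked first, so int() never sees a non-numeric string.
    return value.isdecimal() and low <= int(value) <= high


print(year_in_range("2002", 1920, 2002))  # True
print(year_in_range("2003", 1920, 2002))  # False
print(year_in_range("20o2", 1920, 2002))  # False
```

Without the isdecimal() check, int("20o2") would raise a ValueError.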

Checking for height (hgt), I am checking to see if "cm" or "in" appears in the string, then I use some regex to get the number from the string and do a range check for those as well.
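
A stricter variation on the height check is sketched here. It uses one regex to require digits followed by exactly "cm" or "in", which is a different technique from the substring test in my code below. HEIGHT_LIMITS and valid_height are made-up names for the example.

```python
import re

# Hypothetical lookup table: unit -> (minimum, maximum), using the puzzle's ranges.
HEIGHT_LIMITS = {"cm": (150, 193), "in": (59, 76)}


def valid_height(value):
    # fullmatch requires the whole string to be digits followed by a unit.
    match = re.fullmatch(r"(\d+)(cm|in)", value)
    if match is None:
        return False
    number, unit = match.groups()
    low, high = HEIGHT_LIMITS[unit]
    return low <= int(number) <= high


print(valid_height("60in"))   # True
print(valid_height("190cm"))  # True
print(valid_height("190in"))  # False
print(valid_height("190"))    # False
```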

Hair color (hcl) can be any valid hexadecimal color, so I decided to use regex again to check for a string that begins with "#" and is followed by a six-character string containing a-f and 0-9.
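
Here's the hair color pattern on its own with a few made-up values, as a sketch. I'm using re.fullmatch in place of the ^ and $ anchors, and valid_hair_color is a hypothetical name.

```python
import re

# "#" followed by exactly six characters from a-f or 0-9.
HAIR_COLOR = re.compile(r"#[a-f0-9]{6}")


def valid_hair_color(value):
    return HAIR_COLOR.fullmatch(value) is not None


print(valid_hair_color("#123abc"))  # True
print(valid_hair_color("#123abz"))  # False
print(valid_hair_color("123abc"))   # False
```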

For eye colors (ecl), I created a list of the possible eye colors and matched that to the string provided.

Finally, for passport ID (pid), I used regex again to check for a nine-digit number.
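
The passport ID check can be tried out the same way. This is a sketch with a hypothetical valid_passport_id helper. Keeping the value as a string matters here, since leading zeros count toward the nine digits.

```python
import re


def valid_passport_id(value):
    # Exactly nine digits, leading zeros included.
    return re.fullmatch(r"[0-9]{9}", value) is not None


print(valid_passport_id("000000001"))   # True
print(valid_passport_id("0123456789"))  # False
```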

Note that I didn't even bother with cid in this case, since we always want that to pass.

import re

# Validation statements
# Since the final function only counts "pass",
# I'm using fail1, fail2, etc for debugging purposes.
def passport_validation(passport):
    if not passport['byr'].isdecimal():
        return "fail1"
    if not 1920 <= int(passport['byr']) <= 2002:
        return "fail2"
    if not passport['iyr'].isdecimal():
        return "fail3"
    if not 2010 <= int(passport['iyr']) <= 2020:
        return "fail4"
    if not passport['eyr'].isdecimal():
        return "fail5"
    if not 2020 <= int(passport['eyr']) <= 2030:
        return "fail6"
    if ("cm" not in passport['hgt']) and ("in" not in passport['hgt']):
        return "fail7"
    if "cm" in passport['hgt']:
        hgt = re.sub(r'\D', '', passport['hgt'])
        # Guard against a value with no digits, which would crash int().
        if not hgt or not 150 <= int(hgt) <= 193:
            return "fail8"
    if "in" in passport['hgt']:
        hgt = re.sub(r'\D', '', passport['hgt'])
        # Guard against a value with no digits, which would crash int().
        if not hgt or not 59 <= int(hgt) <= 76:
            return "fail8"
    if not re.match('^#[a-f0-9]{6}$', passport['hcl']):
        return "fail9"
    eye_colors = ['amb', 'blu', 'brn', 'gry', 'grn', 'hzl', 'oth']
    # "in" does the same matching as looping over the list by hand.
    if passport['ecl'] not in eye_colors:
        return "fail10"
    if not re.match('^[0-9]{9}$', passport['pid']):
        return "fail11"

    return "pass"

It's kind of a mess, but it works. As I say in the comment above, I used different return messages for fails so I could debug any problems with my logic. Since I'm only going to count the string "pass", I can return whatever I want when validation fails.
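
As an example of how the numbered fail messages can help with debugging, here's a sketch that tallies them with collections.Counter. The results list is made up for the illustration and isn't real output from my input.

```python
from collections import Counter

# Hypothetical return values from passport_validation.
results = ["pass", "fail2", "fail9", "fail2", "pass", "fail2"]

# Tally only the failures to see which rule rejects the most passports.
failures = Counter(result for result in results if result != "pass")
print(failures.most_common())  # [('fail2', 3), ('fail9', 1)]
print(results.count("pass"))   # 2
```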

Now I need to update my check_all_passports function to run the second set of rules. I'm adding a second argument to the function, validate, that will trigger the additional validation. If I pass True, it will run the additional validation.

Instead of taking the checked passports and running a count on them, I'm going to run them through a filter function to keep only the passports that pass. Then, I'm running a map function to pull out just the passport data (y[1]). That way, I can pass a clean list of passport data into passport_validation.

If the validate argument is True, there's another list comprehension that runs the passports through passport_validation, then it gets the count of "pass" in that list. Otherwise, it simply returns the length of passed.

# Run through all passports in the file
# and report back the number of "pass" passports.
def check_all_passports(passports, validate):
    checked_passports = [passport_checker(p) for p in passport_processor(passports)]
    passed = list(map(lambda y: y[1], (filter(lambda x: x[0] == "pass", checked_passports))))
    if validate:
        validated_passports = [passport_validation(p) for p in passed]
        return validated_passports.count("pass")
    else:
        return len(passed)


print(check_all_passports(lines, False))
print(check_all_passports(lines, True))
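
The filter and map step could also be written as a single list comprehension that unpacks each (status, passport) tuple. Here's a sketch showing that both versions give the same list, using made-up sample tuples.

```python
# Hypothetical output of passport_checker for three passports.
checked_passports = [
    ("pass", {"pid": "000000001"}),
    ("fail", {"pid": "12"}),
    ("pass", {"pid": "000000002"}),
]

# The filter/map version used above.
passed_map_filter = list(
    map(lambda y: y[1], filter(lambda x: x[0] == "pass", checked_passports))
)

# The same result with tuple unpacking in a list comprehension.
passed = [passport for status, passport in checked_passports if status == "pass"]

print(passed == passed_map_filter)  # True
```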

Refactoring

I very much want to refactor this to be more organized and "sane". It's not too much of a mess as-is, but I'd like the additional validation to be cleaner. While I don't have the time right now, hopefully I can go back and refactor while I'm on vacation next week! I'll add an update when I finish my refactor.

mary.codes