Package parser

Class ParserEngine


  • public class ParserEngine
    extends java.lang.Object
    Parses input files and converts them to the formats used in the project.
    • Constructor Summary

      Constructors 
      Constructor Description
      ParserEngine()  
    • Method Summary

      Modifier and Type Method Description
      private java.util.Map<java.lang.String,​java.util.ArrayList<Station>> addStationCoordsToRouteStationsMapping​(java.util.Map<java.lang.String,​java.util.ArrayList<Station>> routesToStations, java.lang.String fileName)
      Parses stops.csv and extends the mapping routeId - List<Station> with the geo-coordinates of each station.
      private Map createMap​(java.util.Map<java.lang.String,​java.util.ArrayList<Station>> routeIdsToStations, java.util.Map<java.lang.String,​java.lang.String> routeIdsToLineNames)
      Creates a List of lines based on the mapping routeIdsToStations, extracts the names of the U-Bahn lines from their identifiers, and creates a Line for each U-Bahn line.
      Map createMapFromBVGFiles()
      Creates a common.Map, a set of U-Bahn lines (id, name, List<Station>), from the mappings routes - trips, trips - stations, and routeId - routeName.
      private static <T> java.util.function.Predicate<T> distinctByKey​(java.util.function.Function<? super T,​?> keyExtractor)  
      private java.util.Map<Station,​java.util.ArrayList<java.lang.Long>> getDuplicatesStopsFromCSV()
      Parses stops.csv and creates a mapping Station - List<Long> for stations that share the same geo-location.
      private java.lang.String getLineNameFromTripId​(java.lang.String tripId, java.util.Map<java.lang.String,​java.util.ArrayList<java.lang.String>> routeIdsToTrips, java.util.Map<java.lang.String,​java.lang.String> routesIdsToLineNames)
      Returns the name of the U-Bahn line that serves a given trip.
      private java.util.ArrayList<java.lang.Long> getListOfPossibleNextStops​(java.lang.Long stopId)
      Returns a List of the stops adjacent to stopId (directly preceding or succeeding it).
      java.util.Map<java.lang.Long,​java.util.ArrayList<ScheduleItem>> getStopsWithSchedule()
      Creates a mapping of stopIds to a list of their ScheduleItems (all trains leaving from the station). If the mapping routeIdsToLineNames doesn't exist, it is first created by parsing routes.csv; if the mapping routesToTrips doesn't exist, it is first created by parsing trips.csv.
      java.util.ArrayList<Station> getUBahnStations()
      Creates a mapping routes - stations from the mappings routes - trips and trips - stations.
      private java.util.Map<java.lang.String,​java.util.ArrayList<Station>> mapRoutesToStations​(java.util.Map<java.lang.String,​java.util.ArrayList<java.lang.String>> routesToTrips_dict, java.util.Map<java.lang.String,​java.util.ArrayList<Station>> tripsToStations_dict)
      Creates a mapping routeId - List<Station> from the mappings routeId - List<tripId> and tripId - List<Station>.
      private java.util.Map<java.lang.String,​java.util.ArrayList<java.lang.String>> mapRoutesToTripsFromCSV​(java.util.Map<java.lang.String,​java.lang.String> routeIdToLineName, java.lang.String fileName)
      Parses trips.csv and creates a mapping routeId - List<tripId>.
      private java.util.Map<java.lang.Long,​java.util.ArrayList<ScheduleItem>> mapStopsToScheduleItems​(java.lang.String fileName, java.util.Map<java.lang.String,​java.util.ArrayList<java.lang.String>> routeIdsToTripIds, java.util.Map<java.lang.String,​java.lang.String> routesIdsToLineNames)
      Parses stop_times.csv and, for each stop, creates a List<ScheduleItem> representing the trains leaving from that station throughout the day.
      private java.util.Map<java.lang.String,​java.util.ArrayList<Station>> parseStationTimesFromCSV​(java.lang.String fileName)
      Parses stop_times.csv and creates a mapping tripId - ArrayList<Station>.
      private java.util.Map<java.lang.String,​java.lang.String> readRoutesFromCSV​(java.lang.String fileName)
      Parses the routes.csv file and creates a mapping of routeId - routeName
      private java.util.Map<java.lang.String,​java.util.ArrayList<Station>> removeDuplicatesFromRoutesToStations​(java.util.Map<java.lang.String,​java.util.ArrayList<Station>> routesToStationsWithDuplicates)
      Replaces the ids of duplicate stations with the fixed id of the corresponding station.
      private java.util.Map<java.lang.Long,​java.util.ArrayList<ScheduleItem>> removeInvalidScheduleItems​(java.util.Map<java.lang.Long,​java.util.ArrayList<ScheduleItem>> stopsToScheduleItems)
      Removes schedule items that lead to stations not adjacent to a given station (neither directly preceding nor directly succeeding it).
      private java.util.Map<java.lang.Long,​java.util.ArrayList<ScheduleItem>> updateStopsToScheduleItemsMap​(java.util.Map<java.lang.Long,​java.util.ArrayList<ScheduleItem>> stopsToScheduleItems, java.lang.String[] nextRecord, java.lang.String[] nextNextRecord, java.lang.String lineName)
      Checks whether nextRecord and nextNextRecord are on the same trip and, if so, creates a new schedule item from the stop in nextRecord to the stop in nextNextRecord.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • routeIdsToLineNames

        java.util.Map<java.lang.String,​java.lang.String> routeIdsToLineNames
      • routesToTrips

        java.util.Map<java.lang.String,​java.util.ArrayList<java.lang.String>> routesToTrips
      • stopsToScheduleItems

        java.util.Map<java.lang.Long,​java.util.ArrayList<ScheduleItem>> stopsToScheduleItems
      • routesToStations

        java.util.Map<java.lang.String,​java.util.ArrayList<Station>> routesToStations
      • tripsToStations

        java.util.Map<java.lang.String,​java.util.ArrayList<Station>> tripsToStations
      • stationsToDuplicateIds

        java.util.Map<Station,​java.util.ArrayList<java.lang.Long>> stationsToDuplicateIds
    • Constructor Detail

      • ParserEngine

        public ParserEngine()
    • Method Detail

      • getStopsWithSchedule

        public java.util.Map<java.lang.Long,​java.util.ArrayList<ScheduleItem>> getStopsWithSchedule()
        Creates a mapping of stopIds to a list of their ScheduleItems (all trains leaving from the station). If the mapping routeIdsToLineNames doesn't exist, it is first created by parsing routes.csv; if the mapping routesToTrips doesn't exist, it is first created by parsing trips.csv.
        Returns:
        Mapping stopId - ArrayList<ScheduleItem>
      • getUBahnStations

        public java.util.ArrayList<Station> getUBahnStations()
        Creates a mapping routes - stations from the mappings routes - trips and trips - stations, then uses the mapping routeId - routeName to extract the line names, and finally creates a list of all Stations (with the line names passed as parameters).
        Returns:
        List of Stations
      • createMapFromBVGFiles

        public Map createMapFromBVGFiles()
        Creates a common.Map, a set of U-Bahn lines (id, name, List<Station>), from the mappings routes - trips, trips - stations, and routeId - routeName.
        Returns:
        common.Map containing all U-Bahn lines found in the CSV input files
      • removeDuplicatesFromRoutesToStations

        private java.util.Map<java.lang.String,​java.util.ArrayList<Station>> removeDuplicatesFromRoutesToStations​(java.util.Map<java.lang.String,​java.util.ArrayList<Station>> routesToStationsWithDuplicates)
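        Replaces the ids of duplicate stations with the fixed id of the corresponding station, then removes the duplicate stations from each line.
        Parameters:
        routesToStationsWithDuplicates - mapping routeId - List<Station> that may contain duplicate stations
        Returns:
        routesToStationsWithoutDuplicates - the same mapping with duplicate stations removed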
      • distinctByKey

        private static <T> java.util.function.Predicate<T> distinctByKey​(java.util.function.Function<? super T,​?> keyExtractor)
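        No description is given for this helper. Its signature matches a common stream idiom: a stateful predicate that accepts only the first element seen for each extracted key. The sketch below shows that idiom under this assumption; the method body is not taken from the project source, and the class DistinctByKeySketch and the method firstPerLength are hypothetical names used only for illustration.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class DistinctByKeySketch {

    // Assumed body: remembers every key seen so far and accepts an element
    // only when its key is new. Keys must be non-null, because the backing
    // ConcurrentHashMap does not permit null keys.
    static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        return element -> seen.add(keyExtractor.apply(element));
    }

    // Illustration only: keeps the first word of each length.
    static List<String> firstPerLength(List<String> words) {
        return words.stream()
                .filter(distinctByKey(String::length))
                .collect(Collectors.toList());
    }
}
```

        A predicate built this way carries state, so a fresh one should be created for every stream it filters.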
      • readRoutesFromCSV

        private java.util.Map<java.lang.String,​java.lang.String> readRoutesFromCSV​(java.lang.String fileName)
        Parses the routes.csv file and creates a mapping of routeId - routeName
        Parameters:
        fileName - input file "routes.csv"
        Returns:
        Mapping routeId -> UbahnLineName
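        As a rough illustration of building such a routeId - routeName mapping, the sketch below reads CSV lines that are already in memory. The column positions, the header row, and the names RoutesCsvSketch and readRoutes are assumptions made for the example; the layout of routes.csv and any filtering done by the real method are not documented here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RoutesCsvSketch {

    // Assumed column positions; the real routes.csv layout may differ.
    static final int ROUTE_ID_COLUMN = 0;
    static final int ROUTE_NAME_COLUMN = 2;

    // Builds routeId -> routeName. The first line is treated as a header.
    static Map<String, String> readRoutes(List<String> csvLines) {
        Map<String, String> routeIdToName = new LinkedHashMap<>();
        for (int i = 1; i < csvLines.size(); i++) {
            String line = csvLines.get(i);
            if (line.trim().isEmpty()) {
                continue; // ignore blank lines
            }
            // Naive split: does not handle quoted fields that contain commas.
            String[] columns = line.split(",", -1);
            if (columns.length > ROUTE_NAME_COLUMN) {
                routeIdToName.put(columns[ROUTE_ID_COLUMN].trim(),
                                  columns[ROUTE_NAME_COLUMN].trim());
            }
        }
        return routeIdToName;
    }
}
```

        The documented method takes a file name instead; the lines could be supplied by java.nio.file.Files.readAllLines, for example.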
      • mapRoutesToTripsFromCSV

        private java.util.Map<java.lang.String,​java.util.ArrayList<java.lang.String>> mapRoutesToTripsFromCSV​(java.util.Map<java.lang.String,​java.lang.String> routeIdToLineName,
                                                                                                                    java.lang.String fileName)
        Parses trips.csv and creates a mapping routeId - List<tripId>.
        Parameters:
        routeIdToLineName - mapping routeId -> UbahnLineName
        fileName - path to trips.csv
        Returns:
        HashMap with key routeId and value ArrayList<tripId>
      • parseStationTimesFromCSV

        private java.util.Map<java.lang.String,​java.util.ArrayList<Station>> parseStationTimesFromCSV​(java.lang.String fileName)
        Parses stop_times.csv and creates a mapping tripId - ArrayList<Station>.
        Parameters:
        fileName - path to stop_times.csv
        Returns:
        mapping tripId - ArrayList<Station>
      • mapRoutesToStations

        private java.util.Map<java.lang.String,​java.util.ArrayList<Station>> mapRoutesToStations​(java.util.Map<java.lang.String,​java.util.ArrayList<java.lang.String>> routesToTrips_dict,
                                                                                                       java.util.Map<java.lang.String,​java.util.ArrayList<Station>> tripsToStations_dict)
        Creates a mapping routeId - List<Station> from the mappings routeId - List<tripId> and tripId - List<Station>.
        Parameters:
        routesToTrips_dict - mapping routeId - List<tripId>
        tripsToStations_dict - mapping tripId - List<Station>
        Returns:
        mapping routeId - List<Station>
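        Combining the two mappings amounts to a join on tripId. The sketch below is one possible way to do it and is not taken from the project source. It is generic in the station type S so that it does not depend on the project's Station class, and the class name RouteStationJoinSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RouteStationJoinSketch {

    // For every route, concatenates the station lists of all its trips.
    // Stations served by several trips appear several times; in this sketch
    // removing those duplicates is left to a later step.
    static <S> Map<String, ArrayList<S>> mapRoutesToStations(
            Map<String, List<String>> routesToTrips,
            Map<String, List<S>> tripsToStations) {
        Map<String, ArrayList<S>> routesToStations = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> route : routesToTrips.entrySet()) {
            ArrayList<S> stations = new ArrayList<>();
            for (String tripId : route.getValue()) {
                List<S> tripStations = tripsToStations.get(tripId);
                if (tripStations != null) { // a trip without stop times is skipped
                    stations.addAll(tripStations);
                }
            }
            routesToStations.put(route.getKey(), stations);
        }
        return routesToStations;
    }
}
```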
      • addStationCoordsToRouteStationsMapping

        private java.util.Map<java.lang.String,​java.util.ArrayList<Station>> addStationCoordsToRouteStationsMapping​(java.util.Map<java.lang.String,​java.util.ArrayList<Station>> routesToStations,
                                                                                                                          java.lang.String fileName)
        Parses stops.csv and extends the mapping routeId - List<Station> with the geo-coordinates of each station.
        Parameters:
        routesToStations - mapping routeId - List<Station>
        fileName - path to stops.csv
        Returns:
        mapping routeId - List<Station>, with geo-coordinates set for each station
      • createMap

        private Map createMap​(java.util.Map<java.lang.String,​java.util.ArrayList<Station>> routeIdsToStations,
                              java.util.Map<java.lang.String,​java.lang.String> routeIdsToLineNames)
        Creates a List of lines based on the mapping routeIdsToStations, extracts the names of the U-Bahn lines from their identifiers, and creates a Line for each U-Bahn line. Then adds all lines to a Map and returns it.
        Parameters:
        routeIdsToStations - mapping of routeIds to Stations
        routeIdsToLineNames - mapping of routeIds to route names
        Returns:
        Map object containing all U-Bahn lines.
      • mapStopsToScheduleItems

        private java.util.Map<java.lang.Long,​java.util.ArrayList<ScheduleItem>> mapStopsToScheduleItems​(java.lang.String fileName,
                                                                                                              java.util.Map<java.lang.String,​java.util.ArrayList<java.lang.String>> routeIdsToTripIds,
                                                                                                              java.util.Map<java.lang.String,​java.lang.String> routesIdsToLineNames)
        Parses stop_times.csv and, for each stop, creates a List<ScheduleItem> representing the trains leaving from that station throughout the day.
        Parameters:
        fileName - path to stop_times.csv
        routeIdsToTripIds - mapping of routeIds to tripIds
        routesIdsToLineNames - mapping of routeIds to line names
        Returns:
        mapping stopId - ArrayList<ScheduleItem> covering each of the stations from stop_times.csv
      • updateStopsToScheduleItemsMap

        private java.util.Map<java.lang.Long,​java.util.ArrayList<ScheduleItem>> updateStopsToScheduleItemsMap​(java.util.Map<java.lang.Long,​java.util.ArrayList<ScheduleItem>> stopsToScheduleItems,
                                                                                                                    java.lang.String[] nextRecord,
                                                                                                                    java.lang.String[] nextNextRecord,
                                                                                                                    java.lang.String lineName)
        Checks whether nextRecord and nextNextRecord are on the same trip and, if so, creates a new schedule item from the stop in nextRecord to the stop in nextNextRecord. If either stopId is a duplicate, it is replaced with the id of the fixed Station corresponding to the duplicate.
        Parameters:
        stopsToScheduleItems - mapping stopId - ArrayList<ScheduleItem> to be updated
        nextRecord - record parsed from stop_times.csv
        nextNextRecord - record parsed from stop_times.csv, following nextRecord
        lineName - name of the U-Bahn line serving the section between nextRecord and nextNextRecord
        Returns:
        stopsToScheduleItems with a new ScheduleItem for the station in nextRecord, directed to the station in nextNextRecord
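        The same-trip check and the duplicate-id replacement can be pictured roughly as follows. This is a sketch under several assumptions, not the project's code: the column positions of a stop_times.csv record are guessed, Item is a hypothetical stand-in for ScheduleItem, and the duplicate lookup is passed in as an extra parameter duplicateToFixedId rather than read from a field.

```java
import java.util.ArrayList;
import java.util.Map;

class ScheduleItemSketch {

    // Assumed column positions within a parsed stop_times.csv record.
    static final int TRIP_ID = 0;
    static final int DEPARTURE_TIME = 2;
    static final int STOP_ID = 3;

    // Hypothetical stand-in for the project's ScheduleItem.
    static final class Item {
        final String lineName;
        final String departureTime;
        final long nextStopId;

        Item(String lineName, String departureTime, long nextStopId) {
            this.lineName = lineName;
            this.departureTime = departureTime;
            this.nextStopId = nextStopId;
        }
    }

    // Adds one item "record's stop -> following record's stop" when both
    // records belong to the same trip; otherwise leaves the map unchanged.
    static Map<Long, ArrayList<Item>> update(Map<Long, ArrayList<Item>> stopsToItems,
                                             String[] record,
                                             String[] followingRecord,
                                             String lineName,
                                             Map<Long, Long> duplicateToFixedId) {
        if (!record[TRIP_ID].equals(followingRecord[TRIP_ID])) {
            return stopsToItems; // trip boundary: no connection between the stops
        }
        long fromStop = fixedId(Long.parseLong(record[STOP_ID]), duplicateToFixedId);
        long toStop = fixedId(Long.parseLong(followingRecord[STOP_ID]), duplicateToFixedId);
        stopsToItems.computeIfAbsent(fromStop, id -> new ArrayList<>())
                .add(new Item(lineName, record[DEPARTURE_TIME], toStop));
        return stopsToItems;
    }

    // Maps a duplicate stop id to its fixed id; other ids are returned as-is.
    static long fixedId(long stopId, Map<Long, Long> duplicateToFixedId) {
        return duplicateToFixedId.getOrDefault(stopId, stopId);
    }
}
```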
      • getLineNameFromTripId

        private java.lang.String getLineNameFromTripId​(java.lang.String tripId,
                                                       java.util.Map<java.lang.String,​java.util.ArrayList<java.lang.String>> routeIdsToTrips,
                                                       java.util.Map<java.lang.String,​java.lang.String> routesIdsToLineNames)
        Returns the name of the U-Bahn line that serves a given trip.
        Parameters:
        tripId - id of the trip to look up
        routeIdsToTrips - mapping of routeIds to tripIds
        routesIdsToLineNames - mapping of routeIds to line names
        Returns:
        name of the U-Bahn line corresponding to tripId
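        One straightforward way to resolve a trip to its line name is to find the route whose trip list contains the trip and then look up that route's name. The sketch below assumes this approach; the class name LineNameLookupSketch is hypothetical, and returning null for an unknown trip is an assumption, since the behaviour in that case is not documented.

```java
import java.util.ArrayList;
import java.util.Map;

class LineNameLookupSketch {

    // Scans the routes for the one that contains tripId and returns the
    // line name registered for that route, or null if no route matches.
    static String getLineNameFromTripId(String tripId,
                                        Map<String, ArrayList<String>> routeIdsToTrips,
                                        Map<String, String> routeIdsToLineNames) {
        for (Map.Entry<String, ArrayList<String>> route : routeIdsToTrips.entrySet()) {
            if (route.getValue().contains(tripId)) {
                return routeIdsToLineNames.get(route.getKey());
            }
        }
        return null; // assumption: unknown trips yield null
    }
}
```

        A scan like this is linear in the number of trips per call; if it were called once per stop_times.csv record, an inverted mapping tripId - routeId built once up front would likely be cheaper.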
      • getDuplicatesStopsFromCSV

        private java.util.Map<Station,​java.util.ArrayList<java.lang.Long>> getDuplicatesStopsFromCSV()
        Parses stops.csv and creates a mapping Station - List<Long> for stations that share the same geo-location. Stations with the same geo-location can be represented by multiple station instances in stops.csv. For the mapping, it selects a single instance and maps it to a list of the ids of all duplicate stations.
        Returns:
        mapping Station - List<Long>
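        The grouping described above can be sketched as follows, working on stops that are already parsed. Several details are assumptions made for the example: Stop is a hypothetical stand-in for Station, locations are compared by exact coordinates, the first stop encountered at a location is chosen as the single instance, and its own id is included in the list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DuplicateStopsSketch {

    // Hypothetical stand-in for the project's Station.
    static final class Stop {
        final long id;
        final String name;
        final double lat;
        final double lon;

        Stop(long id, String name, double lat, double lon) {
            this.id = id;
            this.name = name;
            this.lat = lat;
            this.lon = lon;
        }
    }

    // Maps one representative stop per location to the ids of all stops at
    // that location. Locations with a single stop are dropped at the end.
    static Map<Stop, ArrayList<Long>> findDuplicates(List<Stop> stops) {
        Map<String, Stop> representativeByLocation = new LinkedHashMap<>();
        Map<Stop, ArrayList<Long>> idsByRepresentative = new LinkedHashMap<>();
        for (Stop stop : stops) {
            String location = stop.lat + "," + stop.lon;
            Stop representative =
                    representativeByLocation.computeIfAbsent(location, key -> stop);
            idsByRepresentative
                    .computeIfAbsent(representative, key -> new ArrayList<>())
                    .add(stop.id);
        }
        idsByRepresentative.values().removeIf(ids -> ids.size() < 2);
        return idsByRepresentative;
    }
}
```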
      • removeInvalidScheduleItems

        private java.util.Map<java.lang.Long,​java.util.ArrayList<ScheduleItem>> removeInvalidScheduleItems​(java.util.Map<java.lang.Long,​java.util.ArrayList<ScheduleItem>> stopsToScheduleItems)
        Removes schedule items that lead to stations not adjacent to a given station (neither directly preceding nor directly succeeding it). Such items appear sporadically in the BVG schedule; they are removed to prevent errors in the Map/Graph, which only has edges between adjacent stations.
        Parameters:
        stopsToScheduleItems - mapping stopId - ArrayList<ScheduleItem> that may contain invalid items
        Returns:
        stopsToScheduleItemsWithoutInvalidItems
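        The filtering step might look like the sketch below. It is an illustration rather than the project's implementation: Item is a hypothetical stand-in for ScheduleItem, and the set of adjacent stops is passed in as a map named adjacency instead of being looked up per stop.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class ScheduleFilterSketch {

    // Hypothetical stand-in for the project's ScheduleItem.
    static final class Item {
        final String lineName;
        final long nextStopId;

        Item(String lineName, long nextStopId) {
            this.lineName = lineName;
            this.nextStopId = nextStopId;
        }
    }

    // Returns a new map that keeps, for every stop, only the items whose
    // destination is one of that stop's adjacent stops.
    static Map<Long, ArrayList<Item>> removeInvalid(Map<Long, ArrayList<Item>> stopsToItems,
                                                    Map<Long, Set<Long>> adjacency) {
        Map<Long, ArrayList<Item>> cleaned = new LinkedHashMap<>();
        for (Map.Entry<Long, ArrayList<Item>> entry : stopsToItems.entrySet()) {
            Set<Long> allowed =
                    adjacency.getOrDefault(entry.getKey(), Collections.emptySet());
            ArrayList<Item> kept = new ArrayList<>();
            for (Item item : entry.getValue()) {
                if (allowed.contains(item.nextStopId)) {
                    kept.add(item);
                }
            }
            cleaned.put(entry.getKey(), kept);
        }
        return cleaned;
    }
}
```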
      • getListOfPossibleNextStops

        private java.util.ArrayList<java.lang.Long> getListOfPossibleNextStops​(java.lang.Long stopId)
        Returns a List of the stops adjacent to stopId (directly preceding or succeeding it). Note that a stop can be part of more than one U-Bahn line.
        Parameters:
        stopId - id of the stop whose adjacent stops are looked up
        Returns:
        List of the ids of the stops adjacent to stopId
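        To illustrate why a stop on several lines has more than two neighbours, the sketch below collects the predecessor and successor of a stop on every line that contains it. It assumes each line is available as an ordered list of stop ids, passed in as routeToStopIds; that parameter and the class name AdjacentStopsSketch are hypothetical, since the documented method takes only stopId.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

class AdjacentStopsSketch {

    // Collects the direct neighbours of stopId on every line. A set is used
    // so that a neighbour shared by two lines is reported only once.
    static ArrayList<Long> possibleNextStops(long stopId,
                                             Map<String, List<Long>> routeToStopIds) {
        LinkedHashSet<Long> neighbours = new LinkedHashSet<>();
        for (List<Long> stopsOnLine : routeToStopIds.values()) {
            for (int i = 0; i < stopsOnLine.size(); i++) {
                if (stopsOnLine.get(i).longValue() != stopId) {
                    continue;
                }
                if (i > 0) {
                    neighbours.add(stopsOnLine.get(i - 1)); // preceding stop
                }
                if (i + 1 < stopsOnLine.size()) {
                    neighbours.add(stopsOnLine.get(i + 1)); // succeeding stop
                }
            }
        }
        return new ArrayList<>(neighbours);
    }
}
```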