Thursday, April 21, 2016

Network Analysis

Introduction

The goal of this project is to perform network analysis in order to calculate the impact of trucks on local roads as they travel between mines and rail terminals in Western Wisconsin.

As one could imagine, the increased truck traffic between mines and railroad terminals has sped up the degradation of local roads in Western Wisconsin. Whether trucks are full of sand or empty, the routes from mines to rail terminals will be traversed many times per day, causing increased wear and tear on local infrastructure that may not have been designed for that level of traffic. Using the network analysis tools that are built into ArcMap will allow us to determine the fastest and most likely routes between the mines and the rail terminals. With this information, we can model and estimate the increased truck traffic along specific routes, and then attempt to capture the true cost of road upkeep, which local municipalities will pay for.

One may think that the impacts of increased traffic are negligible. However, according to the current industry analysis in the White Paper, Transportation Impacts of Frac Sand Mining in the MAFC region: Chippewa County Case Study, "At full build-out, the frac sand mining industry will be characterized by mining twenty-four hours a day, five days a week, heavy truck moves over rural roads, and unit or manifest trains moving approximately 40 million tons of sand a year..." (5). The report also predicts, based on investment levels in frac sand mining, a "20 to 30 year life span of the emerging frac sand industry" (5).

As you can see, the impact on the road system will not be a small one or of a short duration, with trucks moving back and forth across these roads almost constantly. Also noted by the White Paper is that, "Wisconsin serves as a model of how local government is using road use or road upgrade maintenance agreements (RUMA) to recover road damages, fund maintenance, and grade crossing improvement" (5). 

But we cannot consider only the transportation of the frac sand. The construction of the mines, along with the hauling of heavy mining equipment, waste, and other materials, also contributes to road impacts from truck transport. As the White Paper notes, "well construction, cement, steel pipes, rig infrastructure, as well as mobile offices are needed" (7).

As an example of the true impact of trucking, the White Paper predicts that "a conservative estimate of truck moves associated with a single well consist of 1,340 one-way truck trips to establish the well, or 2,680 round trip truck movements" (8). The White Paper also provides estimates of the number of truckloads for various mine activities (Fig 1) and of truck impacts on the local infrastructure based on the type of truck movement (Fig 2).


Fig 1. Low and high estimates of typical number of truck loads to complete various activities involved with mining. These estimations are cited in the Transportation Impacts of Frac Sand Mining in the MAFC region: Chippewa County Case Study (8). See sources below.
Again, there is more than just frac sand transportation to consider when it comes to truck transport. Each facility will require specific site equipment and infrastructure to keep the site running, in addition to the movement of replacement parts and heavy mining equipment. Fig 2 illustrates some of what these additional trips may be.


Fig 2. Types of sand mining operations and transportation impacts based on the movement of trucks inside or outside of the mine. These estimations are cited in the Transportation Impacts of Frac Sand Mining in the MAFC region: Chippewa County Case Study (8). See sources below.
With all of the different needs of the mine, as well as the movement of frac sand, one could imagine a substantial amount of wear and tear on local roads. Once the frac sand is trucked from a mine site, the sand is driven to a rail terminal, loaded onto a rail car, and sent to where the sand is needed for hydraulic fracturing. The map below illustrates this movement in Chippewa County. Frac sand trucks can only reach the rail terminals by traversing State Highways and local roads (Fig 3).

Fig 3. Chippewa County Frac Sand Facilities. It is important to note that almost all facilities, rail or mine, are situated on a State Highway. Rail facilities tend to be more centrally located near a city, while mines tend to be far away from that city. See sources below.


In this analysis we will begin to look at what this impact may be using a hypothetical model that focuses only on the impact of frac sand transportation. While a mine is being established and while it is active, the amount of equipment that comes in is relatively small compared to the amount of frac sand that goes out: most of the heavy equipment is hauled in when the mine opens and probably won’t be moved unless it is replaced or the mine becomes inactive and is shut down. The movement of frac sand, as stated previously, almost never stops and occurs at a much greater rate than the movement of equipment, so it will have more impact on local infrastructure and is easier to demonstrate.
 
It is important to note that our analysis of trips and cost is hypothetical and does not reflect real world data. Rather, it demonstrates the process of building a project and undertaking the analysis, which can then be used with real world data and examples.

Methods

For our hypothetical model, we will be using the data that we have gathered over the semester (published in previous blog posts) as well as a new type of analysis we have not previously been exposed to: network analysis.

For our model we will be using the following assumptions: trucks transport frac sand from each mine to its rail terminal 50 times per year (100 trips in total, counting the return trip), and the cost of the impact of the trucks on the road is $0.022 (2.2 cents) per mile.

Using network analysis in ArcMap, we can calculate the routes that would most likely be used by the frac sand mining industry to transport frac sand from mine to rail terminal. This essentially works on the premise that the most direct routes are the most cost effective, and will be the routes taken by the trucks when the frac sand is moved.

In order to automate the process and undertake the analysis faster, we developed this project in two parts: first, writing a Python script to select the mines of interest in Western Wisconsin, and second, using network analysis and Model Builder to automate the process of determining truck routes.

With both the Python script and the Model Builder model, new sites could be added to this analysis to reflect the change in mines over time, allowing the analysis to continue as new information arrives. Additionally, the analysis could be validated by sharing either the Python script or the model so that someone else can undertake their own analysis.


For the first portion of this project we developed a Python script that would...
  1. Select facilities that were in active status
  2. Select from the list of total facilities the sites that were actually mines
  3. Create a feature class for facilities that were both active and mines, which we could use for our analysis
  4. Select mines that are not within 1.5 km of a rail terminal

For a picture of the actual script that was used, please see the page Python Code

The first few steps of the script narrowed down the larger list of facilities to mines that were in active status and created a feature class of those facilities, which would then be used in our later analysis.

The second part of the Python code was necessary because we are only interested in mines that are not within 1.5 km of a rail terminal; mines within that distance have developed their own rail spur. Having a rail spur allows those mines to load frac sand directly onto rail cars, so they do not need trucks to transport the frac sand off site and do not impact local road infrastructure. We have therefore taken them out of our model.
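The selection logic described above can be sketched in plain Python. This is not the arcpy script used in the project (see the Python Code page for that); it is a minimal illustration that assumes each facility is a dictionary with hypothetical `type`, `status`, `x`, and `y` fields, with coordinates already in a projected system measured in meters.

```python
import math

# Hypothetical facility records; names, field names, and coordinates are made up.
facilities = [
    {"name": "Mine A", "type": "Mine", "status": "Active", "x": 0.0, "y": 0.0},
    {"name": "Mine B", "type": "Mine", "status": "Active", "x": 5000.0, "y": 0.0},
    {"name": "Mine C", "type": "Mine", "status": "Inactive", "x": 9000.0, "y": 0.0},
    {"name": "Rail 1", "type": "Rail", "status": "Active", "x": 1000.0, "y": 0.0},
]


def select_truck_mines(records, min_dist_m=1500.0):
    """Return active mines that are farther than min_dist_m from every rail facility."""
    # Steps 1-3: keep facilities that are both active and mines.
    active_mines = [
        r for r in records if r["status"] == "Active" and "Mine" in r["type"]
    ]
    rails = [r for r in records if "Rail" in r["type"]]

    def near_rail(mine):
        # Straight-line distance in meters between the mine and each rail facility.
        return any(
            math.hypot(mine["x"] - t["x"], mine["y"] - t["y"]) <= min_dist_m
            for t in rails
        )

    # Step 4: drop mines that sit within 1.5 km of a rail terminal (assumed rail spur).
    return [m for m in active_mines if not near_rail(m)]


selected = select_truck_mines(facilities)
print([m["name"] for m in selected])
```

In this made-up example, Mine A is dropped because it is 1 km from the rail facility, Mine C is dropped because it is inactive, and only Mine B remains.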

Once our sites were chosen, we added a network dataset of streets to ArcMap. The network dataset allows us to determine the routes from mines to rail terminals using the Make Closest Facility Layer tool.

To do this step of the analysis we used Model Builder (Fig 4), which runs all the tools added to the model sequentially, and much faster than running the tools individually.
Fig 4. Model Builder of the Network Analysis for Part 2. 

The first step in the model was to select all of the facilities that had 'rail' in their listed type, so that we would know that this set of facilities was indeed rail terminals, and then to make a feature class of those rail facilities.

Then we began employing network analysis by creating a closest facility layer. After running the Make Closest Facility Layer tool, we have to use the Add Locations tool twice, once for the facilities (rail terminals) and once for the incidents (mines). Then we can use the Solve tool to get a solved routes feature class. By projecting the routes feature class into UTM WGS84 Zone 15N, we get the length of each route in meters, which is more easily converted to miles than decimal degrees.

By intersecting that projected routes feature class with a counties feature class, we could then run the Summary Statistics tool and determine the total route length within each county.
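To make the Summary Statistics step concrete, here is a small plain-Python sketch of the same idea: after the intersect, every route piece belongs to one county, and we sum the piece lengths by county. The segment values are made up for illustration and the function name is hypothetical; in the project this step was done with the ArcMap tool, not with this code.

```python
from collections import defaultdict


def sum_length_by_county(segments):
    """Sum route segment lengths (meters) for each county name."""
    totals = defaultdict(float)
    for county, length_m in segments:
        totals[county] += length_m
    return dict(totals)


# Made-up (county, length in meters) pairs, as produced by intersecting routes with counties.
segments = [
    ("Chippewa", 12000.0),
    ("Trempealeau", 4000.0),
    ("Chippewa", 3500.0),
]

totals = sum_length_by_county(segments)
print(totals)
```

A route that crosses a county line contributes one segment to each county, which is why the intersect has to happen before the lengths are summed.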

At this point, the Summary Statistics tool had created a table giving the total meters traversed by trucks in each county. To this table we added a field, using the Add Field tool in Model Builder, which converted that distance from meters to miles. We then added two more fields, which calculated the cost of the miles traversed in the county and how much the wear and tear from truck travel over the course of the year would actually amount to in dollars. The equation that we used via the Calculate Field tool was...
  • Cost of Travel = Route Miles* Cost to Infrastructure * Truck Trips per year * 2 (round trip!).
  • or: Cost of Travel = Route Miles * 0.022 * 50 *2
The table is shown below (Fig 6).
Fig 6. Table with added fields for conversion of meters to miles and calculation of cost and total costs per county.
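As a worked example of the cost equation above, the following Python sketch converts a route length in meters to miles and applies the model's assumptions ($0.022 per mile, 50 trips per year, doubled for the round trip). The function and constant names are hypothetical, and the 10-mile route is made up; only the rate and trip count come from the model.

```python
METERS_PER_MILE = 1609.344   # international mile
COST_PER_MILE = 0.022        # dollars per truck mile (model assumption)
TRIPS_PER_YEAR = 50          # one-way trips per year (model assumption)


def annual_road_cost(route_meters):
    """Yearly road cost in dollars for a route length given in meters."""
    route_miles = route_meters / METERS_PER_MILE
    # Cost of Travel = Route Miles * Cost to Infrastructure * Trips per year * 2
    return route_miles * COST_PER_MILE * TRIPS_PER_YEAR * 2


# A made-up county with 10 route miles (16,093.44 m) works out to about $22 per year.
cost = annual_road_cost(16093.44)
print(round(cost, 2))
```

Because every term is a simple multiplier, the cost scales linearly with route miles, which is why the cost graph later follows the route miles graph so closely.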

Results and discussion

While the table in ArcMap works and has all of the information, cleaning up the table and using it to make some graphs will help us to understand how local infrastructure is being impacted. To clean up the table, the information was copied from ArcMap to Excel (Fig 7).

Fig 7. Table copied from Fig 6, which was entered into Excel and cleaned.
After creating the table in Excel, graphs were used to visualize the data and to better understand what the calculations done in Model Builder were actually telling us.

In the Frequency of Facilities Per County graph we can see that many counties have 10 or fewer facilities, several of them under 5, while a few counties have more than 10. The counties with the most facilities are Barron, Chippewa, Trempealeau and Wood.


Fig 8. Frequency of Facilities Per County. Interestingly, most counties have under 10 facilities, which include both mines and rail terminals, while Wood and Barron Counties have between 10 and 20, and Chippewa and Trempealeau Counties have the most facilities, with Trempealeau County having almost 1.5 times as many as the next closest county, Chippewa.
From the Total Route Miles Per County graph we can see that the counties with the most facilities in Fig 8 also tend to have the most route miles. But oddly, although counties with more facilities generally have more route miles than counties with fewer facilities, there is not a simple positive correlation between facility number and route miles traveled per county. This is exemplified by Chippewa and Trempealeau Counties, which are both high facility counties but whose ranking in route miles is the reverse of their ranking in Fig 8.

Fig 9. Total Route Miles Per County. Here we can again see a similar trend of Chippewa and Trempealeau Counties having a higher number of route miles per county, but interestingly Trempealeau County has about half as many route miles as Chippewa County, which tells us that while Trempealeau has more facilities, these facilities are located much closer to rail terminals than in Chippewa County. We can also see that Barron and Wood Counties have a higher number of route miles.

Compared with the facility frequency graph, the Cost of Truck Trips Per Year graph drives home the point that the more route length a county has, the higher the cost of the impact that the frac sand industry will have on its local roads in terms of dollars.

Fig 10. Cost of Truck Trips Per Year. Following a similar trend to the route miles per county graph (Fig 9), we can see that the county impacted most by frac sand mining, in terms of road wear and tear predicted by our model, is Chippewa County. Trempealeau, Barron, and Wood Counties also have higher costs than the other counties in Western Wisconsin.
Fig 11. Map of Trucking Impacts on Wisconsin Counties. The counties in green are low impact counties and will thus have the lowest wear and tear costs on roads, while counties in red have the highest wear and tear and thus the highest cost. 

As we can see from the graphs (Fig 8, 9, 10), the number of facilities that a county has does not directly mean that there will be an increased cost associated with road maintenance from transportation of frac sand. It is perhaps more important to realize that the location of the rail terminal dictates the damage: as in the case of Trempealeau County, a centrally located rail terminal limits the amount of road use. The takeaway from this is really about planning. If a county is planning on expanding or creating mining operations and transportation to rail terminals, locating the rail terminal centrally will limit the costs of maintaining the roads which the trucks frequent.

With the addition of the network analysis map, we can visually highlight which counties specifically have the highest costs, and the placement of mines relative to the rail terminals. The graphs and the map reinforce each other and really drive home the point that the spatial distribution of the mines and terminals, and not facility number, is the main determinant of road cost.

Conclusions

This exercise serves as an example of the power of analyzing geospatial phenomena using ArcMap. The ability of the user to understand ArcMap functions is paramount in solving real world problems.

Again, this is still a hypothetical model of potential impacts, that serves as a demonstration of only a portion of the traffic that these roads receive annually from the mining industry. But even with this limited information we have the ability to use ArcMap as a tool to understand the impacts of future decisions and evaluate future plans.

The other outcome of this analysis concerns the road upgrade maintenance agreements (RUMA) between counties and mines to recover the cost of road damage while the mine is in operation. Knowing how the rail terminals in each county are located can help the counties negotiate an accurate contract with the mining companies to recover the true cost of damage caused to county roads.

Additionally, we can use this model to determine where a new terminal could be placed to limit the amount of damage caused to roads. This would make both the mining companies happy, by decreasing their expenses under the RUMA, and the county happy, because it would not have to pay as much for road repair.



Sources:

  1. http://midamericafreight.org/wp-content/uploads/FracSandWhitePaperDRAFT.pdf
  2. ESRI street map USA is the source for the Network Dataset.
  3. The number of truck trips and cost of truck traffic on county roads was provided by Dr. Hupy

Friday, April 8, 2016

Data Normalization, Geocoding, and Error Assessment

Goals and objectives

The goal of this assignment was to geocode addresses from a table provided by the Wisconsin DNR to locate frac sand mines and frac sand transportation facilities on a map of Wisconsin. Then, after geocoding those sites, the goal was to compare the results of geocoding to actual site location data and analyze any discrepancies between the data sets. This may sound like an easy task; however, it proved to be quite the opposite.

Methods

The information provided by the Wisconsin DNR was not as robust as one would hope. The facility records captured the basic name and property status of each facility, as well as the site type and whether the site was active, but the location information, specifically the site address, was in shambles. Some of the locations were recorded in PLSS notation, while other sites had incomplete or missing information in their addresses, and some did not have any address information at all.

 
Fig 1. Example of information given to us from the Wisconsin DNR. The Address column (yellow) shows the jumble of formats that site addresses came in.




So the first step naturally became to normalize the data in the table. In order to geocode the sites, complete address information was required. To find it, we looked up the site locations and their addresses by searching the PLSS records via the PLSS Finder on the Wisconsin State Cartographer’s Office website (see References). Once the correct PLSS section was located, the individual subsection narrowed down the zone in which the site would be located.

Fig 2. PLSS Finder via the Wisconsin State Cartographer's Office. The information for the zoomed-in PLSS zone is listed under the Township/Range/Section Search, on the left hand side of the picture.


After attempting to locate some of the sites which did have actual address information, more research had to be done to relocate these sites, as the address was not specific or accurate enough to geocode the site correctly. These addresses were usually associated with a highway or county road, which could have multiple names. When they were put into the geocoding tool in ArcMap, the correct road was found but the address was not geolocated in the correct place.

Once the sites were located and geocoded, we received a shapefile from other students who had geocoded the same mines and facilities. We also received a shapefile with the actual locations of the mines. After running the Near tool in ArcMap, we were able to compare the distances of the mines we had geocoded to those of our classmates, to attempt to determine how accurate and precise the geocoding of the mines actually was, as well as how accurate we were in normalizing the table provided by the Wisconsin DNR.
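For readers unfamiliar with the Near tool, the core idea is to report, for each input point, the distance to the nearest feature in another layer. The sketch below is a simplified plain-Python stand-in, not the ArcMap tool itself: it assumes points are (x, y) tuples in a projected coordinate system measured in meters, and the function name and coordinates are made up.

```python
import math


def near_distance(point, candidates):
    """Return the straight-line distance from point to the closest candidate point."""
    return min(
        math.hypot(point[0] - c[0], point[1] - c[1]) for c in candidates
    )


# Made-up coordinates: one geocoded site and two actual mine locations (meters).
geocoded_site = (0.0, 0.0)
actual_mines = [(3.0, 4.0), (10.0, 10.0)]

print(near_distance(geocoded_site, actual_mines))
```

One consequence of this nearest-feature approach is that a badly misplaced geocoded point may be measured against the wrong mine, so the reported distance can understate the true error for that site.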






Results

As the results below will show, even if the data is normalized in a way which is standard, concise, and regulated, the geocoding of that information can be drastically different depending on the individual who has geocoded it. The best practice would be to enter the information in a standard format as the table is being created.

In talking with Dr. Hupy about this issue, she pointed out that this table may not have been intended for outside use, or that it was made as a record rather than tailored for a GIS user. This explanation does make sense as to why the table is not GIS friendly, but even as a record for internal use, standard entry of addresses would make this table much easier to understand for every user, not just GIS users. It is just as important to standardize practices outside of GIS, so that anyone can look at and evaluate the work. This includes information only one person would look at. I am sure that everyone has had the experience of looking back at something they wrote and wondering, "what did I write or mean by that?" If the information being taken down is always recorded in a similar way, then that does not happen.

For GIS specifically, the addresses need to be in a standard format for the geocoder to work. In the first table (Fig 3), the addresses (much like the first example in Fig 1) are not entered in a standard format even if they contain the same information.

Fig 3. Wisconsin DNR Excel Table with site information before normalization
In order to normalize the data and have the geocoder work correctly, the Address column needs to be split up into the individual components the geocoder uses to find each site's specific location.

As seen in Fig 4 (below) the individual components all require their own column with their own heading in order to be interpreted by the geocoder.


Fig 4. Normalized Table



As you can see, the address has been split into sections such as street address, city/township, zip code, and county. Now that the geocoder can read the individual address components separately, the site can be located.
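The kind of splitting shown in the normalized table can be sketched in Python for the simplest case. This is only an illustration under a strong assumption: that a clean address looks like "street, city, state zip". The function name and the example address are made up, and anything that does not fit the pattern (such as a PLSS description) is returned as None so it can be researched by hand, as was done in this assignment.

```python
def split_address(raw):
    """Split 'street, city, state zip' into components, or return None if it does not fit."""
    parts = [p.strip() for p in raw.split(",")]
    if len(parts) != 3:
        return None  # not a standard address; needs manual research
    street, city, state_zip = parts
    tokens = state_zip.split()
    if len(tokens) != 2:
        return None  # state or zip code missing
    state, zip_code = tokens
    return {"street": street, "city": city, "state": state, "zip": zip_code}


# A made-up address in the assumed format, and a PLSS-style entry that cannot be split.
print(split_address("100 Example Rd, Sampleville, WI 00000"))
print(split_address("SW 1/4 Sec 12 T28N R9W"))
```

In practice most of the DNR entries would fall into the None case, which is why the normalization had to be done by hand.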

Once located, these geocoded sites were compared to the actual locations of the sites, based on GPS data from the DNR (Fig 5 and Fig 6, below).

After mapping these sites, other users' geocoded locations were added to the map to compare the differences in the distances between all the users' geocoded sites (Fig 7, below).


Fig 5. A map of the actual frac sand processing sites in Wisconsin. The orange squares represent individual sites.

Fig 6. A map of the locations that I geocoded as compared to their actual locations.


Fig 7. A map of geocoded locations of classmates as compared to the actual mine locations. 
Once all the sites were added to the map, the Near tool was run and distances were determined between the geocoded sites and the actual mine locations. In order to evaluate these data, an error table was generated (Fig 8, below).

Fig 8. Error table, comparing distance of geocoded locations among users for these specific mines.


The columns on the left report the distance (in meters) between where the geocoded site was placed and its actual location, with one column for each user. The statistics on the right are from the Excel statistical analysis tool, and show the mean, median and standard deviation for each user as well as the minimum and maximum distance away a site was placed. I had one of the largest maximum distances from an actual mine location at 63106 meters, but every user had a mean distance of over 2000 meters from the actual sites.
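The same per-user statistics could be reproduced outside of Excel with Python's standard `statistics` module. The sketch below uses made-up distances rather than the values in Fig 8, and the function name is hypothetical; `statistics.stdev` gives the sample standard deviation.

```python
import statistics


def error_summary(distances_m):
    """Summarize one user's geocoding errors (distances in meters)."""
    return {
        "mean": statistics.mean(distances_m),
        "median": statistics.median(distances_m),
        "stdev": statistics.stdev(distances_m),  # sample standard deviation
        "min": min(distances_m),
        "max": max(distances_m),
    }


# Made-up distances for one user; a single large miss pulls the mean well above the median.
summary = error_summary([50.0, 150.0, 400.0, 2400.0])
print(summary)
```

Comparing the mean to the median is a quick way to see whether a user's error is spread evenly or dominated by a few badly placed sites.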


Discussion 

As both Figure 8 (above) and Figure 9 (below) show, the users' geocoding results differ. Everyone did have at least one site that was within 100 meters, and for the most part everyone was in the same county. But in actual situations where decisions need to be based on specific locations, the information provided to us would NOT have been accurate enough to base a decision on. Without the information on the actual locations of the mines (which was provided to us), there would be no way to determine whether the locations we geocoded were actually correct. With a standardized input and data on how accurate geocoding is, we could determine some type of standard error to include in the data analysis, but at this point in time we don't know those figures.

Fig 9. A close up of the geolocated sites for each user as well as the actual mine sites. 

There are two main sources of error in geographic data: operational errors and inherent errors.

Operational errors occur mostly during the procedures for collecting, managing, and using geographic data.

Inherent Errors occur as a result of the special nature of geographic data. Geographic data, as representations of the real world in a certain data model, are necessarily incomplete and generalized.

In this assignment we have encountered both types of errors. From the start of the project, the non-standardized format of the table resulted in operational errors. Not having a set format or procedure for entering geographic data left end users attempting to interpret the information on their own, which can reduce the accuracy and precision of the data at the onset of the assignment, and that error may propagate throughout the rest of the operations.

Inherent errors occurred because geocoding addresses was partially based on the operational error of normalizing data, but also because of the inherent errors of the geocoding tool and the user using it. It is up to the user to determine if the geocoding tool has selected the right address based on the information at hand; if that information is missing or wrong, the user may tell the tool that an incorrect location is correct.


Fig 10. Sources of Error in Geographic data.
The errors that we are specifically encountering fall under data automation and compilation, and data processing: specifically digitizing the mine sites, attribute data input, and format translation. These errors propagate through to the analysis stage and result in the differences seen in the Fig 8 error table.

Conclusion

If you ever have to record information that could be used in an analysis, standardize the data entry and format, and be consistent when recording that information.


References

http://www.sco.wisc.edu/plssfinder/plssfinder.html