Extending Matroid Functionality Google Summer of Code 2016

Monday, August 22, 2016

Overview of what was done

My project has been extending the matroid functionality of SageMath.
As part of my application, and before the summer officially started, I worked on two tickets: https://trac.sagemath.org/ticket/20290 and https://trac.sagemath.org/ticket/14666. The first fixed a typo (and taught me how to use the interface), and the second modified the code that finds a maximum weighted basis of a matroid so that a user can also see whether there is exactly one maximum weighted basis. Both are now incorporated into the official release version of SageMath.
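To make the second ticket concrete, here is a minimal sketch in plain Python (not the Sage code) of a greedy maximum weighted basis together with a uniqueness check. The function name and the `is_independent` callback are assumptions made for illustration. The uniqueness test uses the fact that the greedy basis is the only maximum weighted basis exactly when no element outside it can be swapped in for a basis element of equal weight.

```python
def max_weight_basis(groundset, weight, is_independent):
    """Return (basis, unique) for a matroid given by an independence oracle.

    groundset: iterable of elements
    weight: dict mapping each element to a number
    is_independent: hypothetical callback taking a list of elements
    """
    groundset = list(groundset)
    basis = []
    # Greedy step: take the heaviest elements first, keeping independence.
    for e in sorted(groundset, key=weight.get, reverse=True):
        if is_independent(basis + [e]):
            basis.append(e)

    # Uniqueness step: try every swap of an outside element e for a basis
    # element f. If the swap keeps a basis and does not lose weight, there is
    # a second maximum weighted basis.
    unique = True
    for e in groundset:
        if e in basis:
            continue
        for f in basis:
            swapped = [x for x in basis if x != f] + [e]
            if is_independent(swapped) and weight[f] <= weight[e]:
                unique = False
    return set(basis), unique
```

For example, with three elements where any two are independent, the weights 3, 2, 1 give a unique answer, while the weights 3, 2, 2 do not.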

At the beginning of the summer, I focused on adding certificates to the existing functions is_isomorphic(), the chordal functions, has_minor(), and has_line_minor(). All of these are closed tickets except the last one, which had a merge conflict. This work also gave me a feel for the documentation culture of my organization.
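As an illustration of what a certificate is, here is a small brute-force sketch in plain Python. It is not the Sage implementation: the function name, the description of a matroid by its circuits, and the shape of the return value are all assumptions. When the answer is yes, the function also hands back the bijection that proves it, so the caller can check the answer independently.

```python
from itertools import permutations


def is_isomorphic_with_certificate(E1, circuits1, E2, circuits2):
    """Return (True, bijection) or (False, None).

    Each matroid is given by its groundset and its collection of circuits.
    Brute force over all bijections, so only suitable for tiny examples.
    """
    E1, E2 = sorted(E1), sorted(E2)
    if len(E1) != len(E2):
        return False, None
    target = {frozenset(c) for c in circuits2}
    for image in permutations(E2):
        phi = dict(zip(E1, image))
        # Push every circuit of the first matroid through phi.
        mapped = {frozenset(phi[x] for x in c) for c in circuits1}
        if mapped == target:
            return True, phi  # phi is the certificate
    return False, None
```

A caller who does not trust the answer can apply the returned map to the first collection of circuits and compare the result with the second collection.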

The bulk of my project has been working on implementing An Almost Linear-Time Algorithm for Graph Realization by Robert Bixby and Donald Wagner. This algorithm was written with data structures that didn't exactly match the code base that I was incorporating the function into, so some changes were made there, and some simple (but not necessarily easy) supporting functions were added. There are still some bugs in the code, whose current version can be found here. Much of the rest of this post will be devoted to explaining the data structures that we used for the algorithm. It is aimed mostly at whoever (hopefully future me) is going to finish this function.

We used two new data structures, Node and Decomposition. The decomposition is composed of nodes and relations between them. In particular, it contains a directed tree, where each vertex corresponds to a node. A decomposition also stores information that is useful to the functions that need it. The root of the tree is stored, as are the nodes that contain the first and last vertices of the hypopath, along with these vertices. Also stored are integers that make sure that we don't give two vertices or two edges the same name.

A node contains a graph, a parent marker edge, and a parent marker vertex. The latter is one of the endpoints of the parent marker edge, and is manipulated so that it is the endpoint which will end up being included in the path that comes from the hypopath. A node also stores an integer T, which depends on the iteration of adding edges, and is stored after being computed.
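For whoever picks this up, the two structures can be pictured roughly as follows. This is only a sketch in plain Python: every attribute and method name here is hypothetical, the graph is simplified to a dictionary from edge labels to endpoints, and the actual code differs.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable, Optional, Tuple


@dataclass(eq=False)  # compare nodes by identity so they can be dict keys
class Node:
    # The node's graph, stored as edge label -> (endpoint, endpoint).
    edges: Dict[Hashable, Tuple[int, int]]
    parent_marker_edge: Optional[Hashable] = None
    parent_marker_vertex: Optional[int] = None
    T: Optional[int] = None  # depends on the current iteration; stored once computed

    def __post_init__(self):
        # The parent marker vertex must be an endpoint of the parent marker edge.
        if self.parent_marker_edge is not None:
            ends = self.edges[self.parent_marker_edge]
            if self.parent_marker_vertex not in ends:
                raise ValueError("parent marker vertex is not on the parent marker edge")


@dataclass
class Decomposition:
    root: Node
    parent: Dict[Node, Node] = field(default_factory=dict)  # directed tree: child -> parent
    # Nodes containing the first and last vertices of the hypopath, and those vertices.
    first_node: Optional[Node] = None
    first_vertex: Optional[int] = None
    last_node: Optional[Node] = None
    last_vertex: Optional[int] = None
    # Counters so that no two vertices and no two edges get the same name.
    next_vertex: int = 0
    next_edge: int = 0

    def add_child(self, child: Node, parent: Node) -> None:
        self.parent[child] = parent

    def fresh_vertex(self) -> int:
        v = self.next_vertex
        self.next_vertex += 1
        return v

    def fresh_edge(self) -> int:
        e = self.next_edge
        self.next_edge += 1
        return e
```

Keeping the counters on the decomposition rather than on the nodes means that a name handed out for one node can never collide with a name in another node.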

The flow structure of the main functions is given below. Each function is a decomposition function.

Here is the list of all the functions and the status of each of them. Most of them are supporting functions, with the exception of relink1, typing, relink2, and hypopath from section 4 of the paper, squeeze and update from section 5, and is_graphic from section 6.

Nodes

get_graph(self)

Done

get_parent_marker(self)

Done

get_named_edge(self, f)

Done

get_parent_marker_edge(self)

Done

get_f(self)

Done

set_f(self, int n)

Done

is_polygon(self)

Done

s_path(self, P)

Done

is_cycle(self, P)

Done

_T(self, P, Z=*)

This will correctly give the T value when self is a leaf of the reduced arborescence. It does not correctly compute the T value otherwise.

__relink1(self, Z=*, WQ=*)

Done

__relink2(self, Z=*, WQ=*)

Done

get_T(self)

Done

set_T(self, int T)

Done

CunninghamEdmondsDecomposition

relink1(self, Q, Z=*, WQ=*)

Done

get_D_hat(self, P)

Done

T(self, N, P, T)

This is not done. It needs to be fixed so that it takes into account the types of the children of self.

__typing(self, P, pi)

This is not tested as it relies on T. There are, however, no known deficiencies with the algorithm.

__relink2(Q, Z=*, WQ=*)

Done

__hypopath(self, P)

This is not tested as it relies on __typing. The assigning of u_1 and u_2 needs to be fixed.

__squeeze(self, N, L)

Done

__update(self, P, C)

This is not tested as it relies on __hypopath. It is essentially done, except that the variables u_1, u_2, K_1, and K_2 are not necessarily computed correctly, and U2.4 is not written.

__is_graphic(self)

This is not done. G2 and G3 need to be written, and it needs to be tested. This cannot happen until the rest of the problems are fixed.

merge_with_parent(self, N, N_vertex=*, P_vertex=*)

This is done, but it doesn't use the f if N_vertex and P_vertex are undefined. This should probably be changed.

merge_branch(self, N, P)

This is written, but in order to ensure that the intersection of P with this graph is always a path if possible, P should be replaced with P_0, and the parent markers of children that intersect P should be added to P_0 initially, and removed, in turn, when that child is merged with N.

__add_cycle(self, cycle)

Done

get_arborescence(self)

Done

get_nodes(self)

Done

get_root(self)

Done

__get_pi(self)

This is done, but it should be changed so that it can take a subtree of self.arborescence as an input, and give pi on the reduced decomposition.

branch(self, N)

Done

get_parent(self, N)

Done

Monday, June 27, 2016

Midterm ish

My summer of code is broken up into several projects. There were a lot of small ones, a couple of medium ones, and one large one. Right now, I'm in the midst of working on the large project. Basically, we want to feed Sage a collection of subsets of an edge set E, and have Sage tell us if there is a graph whose cycles are exactly the given subsets of E, and if so, to give a corresponding graph. This boils down to asking if a matroid is graphic, and asking for a graph that realizes the matroid.

For instance, if we have E = {1, 2, 3, 4}, and our collection of sets is every three-element subset of E, then we can't get an appropriate graph. To see this, we start constructing a graph. Our first cycle is {1, 2, 3}. There is only one graph on three edges that has this cycle, namely a triangle. To add the edge 4, we need to have a cycle {1, 2, 4}. But this means that we have to add 4 in parallel to the edge 3. This is a problem, because then {3, 4} is a cycle, so {1, 3, 4}, in particular, is not a cycle of our graph.
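The same conclusion can be checked by brute force. The sketch below is plain Python and has nothing to do with the almost linear-time algorithm: it simply tries every way of placing the edges on at most |E| + 1 vertices (enough, since a graphic matroid always has a connected realization) and compares the cycles of each graph with the requested collection. All function names are made up for this example.

```python
from itertools import combinations, combinations_with_replacement, product


def is_forest(edge_ends):
    """True if the edges, given as (u, v) pairs, contain no cycle (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edge_ends:
        ru, rv = find(u), find(v)
        if ru == rv:  # u and v already connected (or a loop): a cycle closes
            return False
        parent[ru] = rv
    return True


def cycles_of(graph):
    """graph: dict edge label -> (u, v). Return the cycles as frozensets of labels."""
    labels = sorted(graph)
    found = set()
    for size in range(1, len(labels) + 1):
        for subset in combinations(labels, size):
            s = frozenset(subset)
            if any(c <= s for c in found):
                continue  # contains a smaller cycle, so it is not itself a cycle
            if not is_forest([graph[e] for e in subset]):
                found.add(s)
    return found


def realize(E, cycles):
    """Return a graph (dict label -> endpoints) with exactly these cycles, or None."""
    E = sorted(E)
    wanted = {frozenset(c) for c in cycles}
    placements = list(combinations_with_replacement(range(len(E) + 1), 2))
    for ends in product(placements, repeat=len(E)):
        graph = dict(zip(E, ends))
        if cycles_of(graph) == wanted:
            return graph
    return None
```

Calling `realize([1, 2, 3, 4], combinations([1, 2, 3, 4], 3))` searches every graph on four edges and comes back empty-handed, while `realize([1, 2, 3], [{1, 2, 3}])` finds a triangle.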

This example illustrates a key idea of the algorithm. The set {1, 2} is a maximal set that does not contain a cycle, so we skipped over those elements, and started with 3. We then added 3 and any needed elements of {1, 2} to our partial graph. And we kept adding elements till we either had a problem, or till we had added all of the elements.

In our case, the graph never got complicated enough that we had a choice about which graph to use for our partial graph. In general, this is not the case. It would be troublesome to check if we could add the new element to every graph that realizes the already added elements, so we use a decomposition made possible by Whitney's 2-isomorphism theorem to check all of the graph options at once. This of course makes the code more complicated. The algorithm that we are following comes from a paper by Robert Bixby and Donald Wagner.

The tricky part, so far, has been trying to get information in and out of graphs. Graph theorists care a lot about the vertices of a graph and much less about the edges of the graph. That is, they store each edge as a list of the two vertices that it is incident with, and a possible label. Matroid theorists, however, care a lot more about the edges of a graph. This is true in general, and is true in particular for this project.
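One way around this mismatch is to build an index keyed by edge label. The sketch below is plain Python with made-up helper names, and it assumes that edges arrive as (vertex, vertex, label) triples with distinct labels.

```python
def index_by_label(edge_triples):
    """Turn (u, v, label) triples into a dict label -> (u, v)."""
    by_label = {}
    for u, v, label in edge_triples:
        if label in by_label:
            raise ValueError("edge label %r is used twice" % (label,))
        by_label[label] = (u, v)
    return by_label


def edges_at(by_label, vertex):
    """Labels of the edges incident with the given vertex, sorted."""
    return sorted(label for label, ends in by_label.items() if vertex in ends)


# A triangle whose edges are named the way a matroid theorist would want.
triangle = [(0, 1, "a"), (1, 2, "b"), (0, 2, "c")]
by_label = index_by_label(triangle)
```

With the index in hand, `by_label["b"]` gives the endpoints of edge b directly, and `edges_at(by_label, 0)` lists the edges meeting vertex 0, without scanning the vertex-first representation each time.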

Thursday, June 2, 2016

First Week or so

Before coding started, I spent some time on Codecademy getting more familiar with the syntax of Python. I was impressed with the setup that they had (I would recommend it to my mom), and it helped me to learn Python in a systematic way.

Since the 23rd I've been working on adding certificates (proof that we gave the right answer to a yes-no question) to some of the functions in the matroid part of Sage. For the first two days, I spent a lot of time trying to get Sage to compile. For a while, the problem was an error in a new release, and then I had some type of trouble on my end. I've also spent a good amount of time figuring out the ins and outs of documentation practices.

Monday, May 9, 2016

Getting Started

I first heard about Google Summer of Code a little over a year ago. It was something that I wanted to do for several reasons. First, I only had a chance to take a couple of programming classes in undergrad. (I didn't realize that I liked it till part way through my junior year.) Since then, I've wanted to grow the length and complexity of projects that I was capable of successfully working on. Secondly, I like the idea of open source resources, because they are free, and that lets poor college students use cool resources.

My project is building and expanding tools in Sage to be used by people studying matroid theory. A matroid is a notion of independence that generalizes the independence structure that is found in vector spaces and that comes from looking at cycleless subgraphs of graphs. Sage already has a lot of tools that let people work with matroids, mostly created by Stefan van Zwam and Rudi Pendavingh. My project focuses on a small collection of new tools.
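To make the definition a little more concrete, here is a small brute-force check in plain Python (function names invented for this post) that the cycleless edge sets of a graph behave like independent sets: the empty set is independent, subsets of independent sets are independent, and a smaller independent set can always be extended by some element of a larger one.

```python
from itertools import combinations


def is_forest(edge_ends):
    """True if the edges, given as (u, v) pairs, contain no cycle."""
    component = {}

    def find(x):
        component.setdefault(x, x)
        while component[x] != x:
            x = component[x]
        return x

    for u, v in edge_ends:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        component[ru] = rv
    return True


def cycleless_sets(graph):
    """All sets of edge labels of graph (label -> endpoints) with no cycle."""
    labels = sorted(graph)
    return {
        frozenset(s)
        for k in range(len(labels) + 1)
        for s in combinations(labels, k)
        if is_forest([graph[e] for e in s])
    }


def is_independence_system_of_matroid(family):
    """Check the three independence axioms on a family of frozensets."""
    if frozenset() not in family:
        return False
    for I in family:
        # Hereditary: removing any one element stays in the family.
        if any(I - {x} not in family for x in I):
            return False
        # Augmentation: a larger set J can always lend an element to I.
        for J in family:
            if len(I) < len(J) and not any(I | {x} in family for x in J - I):
                return False
    return True
```

Running the check on a triangle with one doubled edge, `{"a": (0, 1), "b": (1, 2), "c": (0, 2), "d": (0, 2)}`, confirms all three axioms for that graph.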

I'll be working with Stefan and Michael on this project.