Mathgen

What?

Mathgen is a program to randomly generate professional-looking mathematics papers, including theorems, proofs, equations, discussion, and references. Try Mathgen for yourself! It’s a fork of SCIgen, a program which generates random papers in computer science.

Why?

Mostly because it’s funny! But there are some other possible uses:

  1. Impress your friends, colleagues and/or tenure committee with your prolific research output.
  2. There are a lot of shady journals out there. I bet one of them would accept a randomly generated paper. Try it, and let me know what happens!
  3. Cheat on your Erdős number.
  4. As a way of producing something possibly worthwhile from this project, I am offering randomly generated books for sale via lulu.com, and will donate $5.00 from each copy sold to the American Mathematical Society, in support of (actual, non-random) mathematical research. This would make a great gag gift for a mathematically inclined friend!
  5. A great way to come up with thesis topics for your grad students!
  6. More seriously, I think this project says something about the very small and stylized subset of English used in mathematical writing. This program only knows a handful of sentence templates, and yet I think its writing style is not far off from many published papers. You could argue this is bad (shows a lack of creativity) or good (makes papers more accessible to those with a limited knowledge of English), but I think we could stand to pay more attention to our writing styles, instead of unthinkingly relying on stock phrases.

How?

Mathgen uses a handwritten context-free grammar, essentially starting from a basic template and filling in blanks with textual elements of various types. Those elements could in turn contain other blanks, so the process continues recursively.
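
To make the idea concrete, here is a minimal sketch of that kind of recursive template expansion. It is written in Python for brevity and is not Mathgen's actual Perl code. The grammar, the `{BLANK}` placeholder syntax, and the names `GRAMMAR` and `expand` are all assumptions made up for illustration.

```python
import random
import re

# Hypothetical toy grammar, invented for this sketch (not Mathgen's rules).
# Each key is a kind of blank; each value lists the templates that can fill it.
GRAMMAR = {
    "SENTENCE": [
        "Let {OBJECT} be {ADJECTIVE}.",
        "It is well known that every {ADJECTIVE} {NOUN} is {ADJECTIVE}.",
    ],
    "OBJECT": ["$X$", "a {NOUN}", "a {ADJECTIVE} {NOUN}"],
    "ADJECTIVE": ["compact", "abelian", "measurable"],
    "NOUN": ["group", "manifold", "functor"],
}

# A blank is an upper-case symbol name wrapped in braces, e.g. {NOUN}.
BLANK = re.compile(r"\{([A-Z]+)\}")


def expand(symbol, rng):
    """Pick a random template for `symbol` and recursively fill its blanks."""
    template = rng.choice(GRAMMAR[symbol])
    # Each blank found in the template is replaced by its own expansion,
    # which may itself contain further blanks.
    return BLANK.sub(lambda match: expand(match.group(1), rng), template)


if __name__ == "__main__":
    rng = random.Random()
    for _ in range(3):
        print(expand("SENTENCE", rng))
```

Passing in a seeded `random.Random` makes the output reproducible, which is handy if you want to regenerate the same "paper" later.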

The generator itself is written in Perl. The text is then processed by LaTeX and BibTeX to produce the final output file.

The source code is available through GitHub at:

https://github.com/neldredge/mathgen

If you don’t want to mess with Git, you can just get a zip file containing the code.

Mathgen is free software and released under the terms of the GNU General Public License, version 2.0.

Who?

Mathgen was written by Nate Eldredge, incorporating code from SCIgen, by Jeremy Stribling, Max Krohn, and Dan Aguayo, without whom this project would not exist.
Jordan Eldredge wrote most of the web interface (the parts that are slick and work well; the ugly awkward parts are mine).

A list of names of famous mathematicians, used in the program, was extracted from the web site The Greatest Mathematicians of All Time by James Dow Allen, and is used by permission. A list of countries and other place names was taken from Wikipedia.

23 thoughts on “Mathgen”

  1. Hello,
    I am a first-year PhD student at LSE, and I just found out about this awesome idea. I think there is very interesting room in economics to test a couple of hypotheses. For instance, it seems to me that the “network” is incredibly important in economics (more than in other sciences, I would say). By this I mean that the chance of being accepted at a conference is greater with famous co-authors than without them, holding the quality of the paper constant (the same random paper-generating process).
    What do you think? How difficult would it be to adapt this code for economics?

    • In 1990 I was looking at using Markov chains to generate post-modern management papers, which I would try submitting. But I was too lazy to collect a large enough corpus. The great “advantage” of Markov chains is that most of your sentences won’t even be grammatical sentences of English, so getting those past reviewers is a stronger demonstration.

      Anyway, once Alan Sokal had his hand-crafted paper published in Social Text, reviewers probably got a bit more skeptical.

  2. Pingback: Pratyush Delicious links for October 20, 2012 | Pratyush Kotturu - KE5YQZ

  3. Pingback: Dancing p-Values &c « Kynosarges Weblog

  4. Pingback: "El artículo" de MathGen: la historia de cómo se la han colado a una publicación científica - Gaussianos

  5. I am no expert mathematician and don’t have experience in Perl; however, I think it might lend realism if the summation (or integration, product, coproduct, union, intersection, n-fold tensor product, …) dummy variables occasionally appeared in the summand (or integrand, etc.). Not all the time, of course; then it would make too much sense. But might it be possible to heighten the probability of them appearing?

  6. Pingback: Why Marcie Rathke is not Alan Sokal | That's Mathematics!

  7. Pingback: What kind of journal publishes nonsense? « Protons for Breakfast Blog

  8. Please include more logicians.

    Also, update the list of mathematicians to include more contemporaries (within the last 50 years) so it can include names like Tao, Woodin, Ribet, and Wiles.

    Just a suggestion.

  9. Pingback: Mein erstes Paper als Wirtschaftsphilosoph | Wirtschaftsphilosoph

  10. Pingback: Another Mathgen paper accepted | That's Mathematics!

  11. I thought I might create a context-free grammar to generate website comments so I could comment more appropriately here, but …
    (1) I am too lazy.
    (2) I think it’s already been done, and has been deployed on YouTube.

  12. The first obvious problem I see is that the variables used in the equations aren’t defined beforehand. Other than that, it looks great!

  13. This looks impressive :)
    Is there a possibility that the PHP files and source for the website could be made open source in the future?

    • Maybe. The main reason I haven’t done this is paranoia that I missed some input validation somewhere, and if so I’d rather not make it too easy to exploit. It’s quite simple code, anyway. But I’ll keep this in mind.

  14. Pingback: New Mathgen book: Galois Knot Theory | That's Mathematics!
