Doxygen makes LAPACK Code Ugly ...

Postby michaellehn » Sat Jan 05, 2013 11:33 am

Up to version 3.3.1, the source code of LAPACK was very clean and slim. In particular, the source code itself was pleasant-to-read documentation (in ASCII format). Starting with version 3.4.0, this slickness and slimness was sacrificed for the sake of (Doxygen-generated) documentation in HTML format.

I don't really understand why it was necessary to mess up the source code. The structure of the comments was both simple and uniform. Hence it was easy to parse! All kinds of documentation (man pages, info pages, HTML, etc.) could be generated without modifying the Fortran code, i.e. without inserting tons of Doxygen markup.

To illustrate what I mean, I set up a little "proof of concept" page:

Demo: Documentation Generated from LAPACK 3.3.1 Source Code

The basic idea is:
  • You have a source file like dgeqp3.f
  • The documentation gets extracted, leading to an intermediate result like dgeqp3.doc.txt
  • The latter finally gets transformed into a man page, an info page, ... or an HTML page like dgeqp3.html
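
The extraction step can be sketched roughly as follows. This is a minimal Python sketch and not the actual DocTool code: the function name `extract_doc` and the `SAMPLE` routine are made up for illustration, and it assumes the classic comment layout in which the documentation starts at a `Purpose` heading and ends at a long ruler of `=` characters.

```python
import textwrap

# Hypothetical miniature routine in the classic (pre-3.4.0) comment style.
SAMPLE = """\
      SUBROUTINE FOO( N )
*
*  Purpose
*  =======
*
*  FOO does nothing.
*
*  =====================================================================
*
      END
"""


def extract_doc(source: str) -> str:
    """Return the comment text from the 'Purpose' heading up to the ruler."""
    doc = []
    in_doc = False
    for line in source.splitlines():
        # Fixed-form comment lines carry a marker in column 1.
        is_comment = line[:1] in ("*", "C", "c", "!")
        text = line[1:].rstrip() if is_comment else ""
        if not in_doc:
            if is_comment and text.strip() == "Purpose":
                in_doc = True
                doc.append(text)
            continue
        if not is_comment:
            break  # the comment block ended
        stripped = text.strip()
        if len(stripped) > 40 and set(stripped) == {"="}:
            break  # long ruler of '=' closes the documentation (assumption)
        doc.append(text)
    # Remove the common indentation that followed the comment marker.
    return textwrap.dedent("\n".join(doc)).strip("\n")


if __name__ == "__main__":
    print(extract_doc(SAMPLE))
```

The returned plain text plays the role of the intermediate file; turning it into a man page or an HTML page is then a separate formatting step.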

Adding some little markup hints can improve the look of the generated documentation without (completely) destroying the slickness and simplicity of the Fortran source code:
  • In dgeqp3.html names of matrices and vectors are written in typewriter style. Also, we added some LaTeX code in the Further Details section.
  • The Fortran code dgeqp3.f is still rather nice and readable.

I guess there is no way back. But I would be interested in your opinion ...
michaellehn
 
Posts: 5
Joined: Tue Apr 13, 2010 2:59 pm

Re: Doxygen makes LAPACK Code Ugly ...

Postby CyLith » Sun Jan 06, 2013 7:59 am

Oh holy hell that's 100x better than what it is now! For some reason, they decided to add tons of awful markup so that they could generate the equally terrible online docs. Do you have scripts available for your conversion process? I'm working on a header-only reimplementation of the basic (and imho more important) parts of lapack in C++, and I'd love to generate nice documentation without seriously messing up all the code.
CyLith
 
Posts: 37
Joined: Sun Feb 08, 2009 7:23 am
Location: Stanford, CA

Re: Doxygen makes LAPACK Code Ugly ...

Postby michaellehn » Mon Jan 07, 2013 6:58 pm

Yesterday I added a simpler variant that keeps the descriptive text in verbatim boxes: dgeqp3.f becomes dgeqp3.html.

The result looks less nice but is easier to implement. I converted all of LAPACK 3.3.1, and only in a few cases would some additional work be required. For example, cgbequb.html still has an 'IMPLICIT NONE' artifact ...


CyLith wrote:Oh holy hell that's 100x better than what it is now! For some reason, they decided to add tons of awful markup so that they could generate the equally terrible online docs.


What I don't understand is why they actually included the Doxygen markup in the LAPACK source code. It would not be difficult to automatically generate a file like lapack-3.4.2/dgeqp3.f from lapack-3.3.1/dgeqp3.f. This could be achieved with a small Perl script and easily integrated into the tool chain... As an alternative, I am considering writing a Perl script that reverses the process, so that I can have a local copy with the classic LAPACK comments.


CyLith wrote:Do you have scripts available for your conversion process? I'm working on a header-only reimplementation of the basic (and imho more important) parts of lapack in C++, and I'd love to generate nice documentation without seriously messing up all the code.


I created the LAPACK documentation with DocTool, which I wrote for my own project FLENS. I will upload the code for extracting the LAPACK comments in the next few days.

As I wrote DocTool mainly for myself, I did not spend much effort on documenting it. However, you might get an idea by looking at some examples:

1) For creating tutorials I adopted some ideas from Literate Programming. That means documentation, source code, compile instructions, post-processing, etc. can all be put into a single file. When the documentation gets created, the code actually gets compiled and executed:

2) As embedding the complete source code into the documentation is sometimes cumbersome, you can also use an import directive:

3) Here is an example that contains links, lists, tables, lists in tables, ...

4) I am using libclang for parsing C++ code and extracting function stubs, methods, ... from C++ source files.
  • So from a C++ file like gbmatrix.h
  • I create these stubs gbmatrix.doc
  • Without further editing the default still looks pretty poor: gbmatrix.html. The boxes are clickable and let you jump to the corresponding position in the header file.
  • Here is an example where some additional text has already been added: densevector.html
  • And here is an example for documenting functions: lapack/ge/trf

5) Cross-referencing is pretty easy thanks to libclang. If you look at source code listings like lapack-gelsy.cc, you see that some words are (dotted) underlined. If you click on such a word, you jump to its definition. This includes variables, functions, classes, ... If overload resolution is not possible at compile time, the word is underlined in red and you get a list of (clickable) possibilities.

Re: Doxygen makes LAPACK Code Ugly ...

Postby admin » Thu Jan 10, 2013 1:15 am

Dear Michael,
First thank you for your post and the very detailed webpage.
The move to Doxygen was due to the lack of tools to generate readable documentation. In particular, we had a lot of problems with man page generation, and the HTML pages were really basic. We ended up working more on tools/scripts than on the library itself.
To tell you the truth, we were not happy to make the comments ugly.
But we believed the value we added was worth it: the search, the graphs, the navigation links in the routines.

The LAPACK project lives on the feedback and contributions of users like you. If you believe there is a way to get more readable HTML documentation with the same features that Doxygen offers, we will be happy to work with you on enhancing the LAPACK library. Our requirements are that the tools be automated and maintained for as long as LAPACK is around.

Thank you again
Julie
admin
Site Admin
 
Posts: 498
Joined: Wed Dec 08, 2004 7:07 pm

Re: Doxygen makes LAPACK Code Ugly ...

Postby michaellehn » Thu Mar 13, 2014 5:29 pm

Hi Julie,

Sorry for the late reply. The most challenging ingredient of a good documentation tool for LAPACK is a good lexer and parser for Fortran. This is crucial for navigation links in routines, syntax highlighting, etc. As Fortran has been around for a long time, one would assume this to be simple. Parsing Fortran is actually simple. However, lexing (i.e. breaking Fortran code into tokens) is pretty tough. An interesting and prominent example is the code snippet

Code:
      DO 10 I = 1, 20
         DO 10 I = 1.20
   10 CONTINUE

In the second line we actually have an assignment in which a variable do10i gets the value 1.20. That's because spaces are ignored in fixed-form Fortran source. Most Fortran documentation tools (including Doxygen) ignore such cases. In most cases that is okay, because most programmers follow some sensible coding style. But you demand, for good reason, a robust and maintainable tool. This cannot be accomplished if the tool makes inherent assumptions about coding style. Sooner or later you would have to incorporate more and more 'coding style exceptions'. So for the sake of robustness and maintainability this job needs to be done right or not at all. And doing it right means: whatever a Fortran compiler can eat can be fed to the tool.
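
To make the ambiguity concrete, here is a minimal Python sketch of one common way to tell the two statements apart: remove all blanks, then treat a statement starting with DO as a loop only if a comma appears outside parentheses after the `=`. The function name `classify` is hypothetical, and the sketch deliberately ignores character constants, Hollerith constants and continuation lines, so it is an illustration and not a real lexer.

```python
import re


def classify(statement: str) -> tuple:
    """Classify a fixed-form statement field as DO loop, assignment or other."""
    # Blanks are insignificant, so drop them all (character constants are
    # ignored in this sketch, which is an assumption).
    s = statement.replace(" ", "").upper()

    depth = 0
    seen_equals = False
    top_level_comma = False
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif depth == 0 and ch == "=":
            seen_equals = True
        elif depth == 0 and ch == "," and seen_equals:
            # A comma outside parentheses after '=' cannot occur in an
            # assignment, so this must be the loop bounds "e1, e2".
            top_level_comma = True

    if s.startswith("DO") and top_level_comma:
        m = re.match(r"DO(\d*),?([A-Z][A-Z0-9]*)=", s)
        if m:
            return ("DO", m.group(1), m.group(2))
    if seen_equals:
        return ("ASSIGN", s.split("=", 1)[0])
    return ("OTHER", s)


if __name__ == "__main__":
    for stmt in ("DO 10 I = 1, 20", "DO 10 I = 1.20", "10 CONTINUE"):
        print(stmt, "->", classify(stmt))
```

The point of the sketch is that the decision cannot be made token by token from left to right: the whole statement has to be inspected before the first characters can be classified.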

For my C/C++ documentation tool DocTool I use the lexer/parser of the C++ compiler clang. For Fortran I did not find an equivalent tool. So, based on f2c, I coded a similar tool which I call f77crash (Fortran 77 cross referencing and syntax highlighting). Feeding it the above snippet simply gives

Code:
STATEMENT,3.6:3.7,0,33,do
CONSTANT,3.9:3.10,0,6,10
STATEMENT,3.14:3.14,0,76,=
CONSTANT,3.16:3.16,0,6,1
CONSTANT,3.19:3.20,0,6,20
STATEMENT,4.17:4.17,0,76,=
CONSTANT,4.19:4.22,0,7,1.20
STATEMENT,5.6:5.13,10,29,continue
VARIABLE,3.12:3.12,2,i
VARIABLE,4.9:4.15,1,do10i


So that is the key ingredient. The rest is cosmetics. I wrote a simple prototype which does the following:

  1. Documentation gets extracted from the LAPACK 3.3.1 source code files:
    1. Signature of the function. Clicking on it leads you to the source code.
    2. Purpose sections
    3. Arguments section where clicking on the variables lets you jump into the source file to the point of definition
    4. Call graph where nodes are linked with the corresponding source files
    5. Caller graph where nodes are linked with the corresponding source files
  2. Source files get cross referenced:
    1. If you click on a variable you jump to its definition
    2. If you click on an external function/subroutine you jump to its source file
    3. If for an external function/subroutine more than one destination is possible (e.g. ILAENV) you get a select box
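
The relation between the call graph and the caller graph in the list above can be sketched as follows. This is a minimal Python sketch, assuming the calls of each routine have already been collected by the lexer; the function name `invert_call_graph` and the routine names in the example are made up and not taken from LAPACK.

```python
from collections import defaultdict


def invert_call_graph(calls: dict) -> dict:
    """Turn a mapping 'routine -> routines it calls' into 'routine -> callers'."""
    callers = defaultdict(set)
    for caller, callees in calls.items():
        for callee in callees:
            callers[callee].add(caller)
    # Sorted lists give a stable order for rendering the graph nodes.
    return {name: sorted(names) for name, names in callers.items()}


if __name__ == "__main__":
    # Hypothetical routines, for illustration only.
    calls = {"DRIVER": {"SOLVE", "ERRMSG"}, "SOLVE": {"ERRMSG"}}
    print(invert_call_graph(calls))
```

Both graphs can therefore be produced from a single pass over the sources: the call graph is collected directly, and the caller graph is its inversion.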

Maybe the simplest way to explain it is by showing examples:

Actually you can browse through the complete LAPACK 3.3.1 package. You will also find examples where the cosmetics need more polishing, for example the scaling of the graphs if you have many/few callers or calls:

Also the file browser is kept very simple.

All the tools are available from GitHub: https://github.com/michael-lehn/LapackDoc

How the tools work is documented here: http://apfel.mathematik.uni-ulm.de/~lehn/LapackDoc/

Here you can browse through the documentation generated for the complete LAPACK 3.3.1: http://apfel.mathematik.uni-ulm.de/~lehn/lapack-3.3.1/dir.html

Cheers,

Michael

Edit: I also have a tenured position. So I have time to work on pet projects.
Last edited by michaellehn on Wed Mar 19, 2014 4:52 pm, edited 2 times in total.

Re: Doxygen makes LAPACK Code Ugly ...

Postby michaellehn » Sat Mar 15, 2014 6:06 am

code responsibly ;-)


