phpBibLib
Using phpBibLib (non-cached)

The library can be used directly on .bib files, or cached using a (SQLite) database. This file is an example of the non-cached version. Go here to see the demonstration of the cached version.

The basic outline is the same for both scenarios. First, one instantiates a Bibtex object and tells it where to obtain the references from, in this case the 'refs.bib' and 'references.bib' files.
Note that currently the `cite` and `citet` functions can both access a Bibtex object either as their first argument, or as default case, through the global $Site variable.

$Path['lib'] = './lib/';
require_once $Path['lib'] . 'lib_bibtex.inc.php';

$bib = new Bibtex('refs', 'references.bib');
       
$Site['bibtex'] = $bib; // store the object in $Site['bibtex'] if we want
                        //  to call cite(..) and citet(..) without passing
                        //  a Bibtex object as their first argument.

The above Bibtex object parses the provided bibtex files on every run. That is fine for a small site using small bibtex files, such as a personal list of your publications, but it is inefficient nevertheless.
Alternatively, one can use a database-cached version of the library, which only re-parses the provided bibtex files if their modification dates do not match the dates stored in the database. Although the code uses fairly standard SQL and the PDO interface, it has only been tested using SQLite.
Analogous to the above, one can create a database-cached version of a Bibtex object as follows:

$bdb = new PDO('sqlite:./dbib.sqlite');             // get a PDO for a sqlite database,
                                                    //  see below for the schema
$bib = new DBibtex($bdb, 'refs', 'references.bib'); // create the actual object. 
                                                    //  if no .bib's are given, only 
                                                    //  data in database is used, no 
                                                    //  attempts to reparse are made

The schema for the database is:

CREATE TABLE bibmeta (fname TEXT, fdate TEXT);
CREATE TABLE bibdata (type TEXT, key TEXT, author TEXT, title TEXT, 
  publisher TEXT, year TEXT, booktitle TEXT, editor TEXT, journal TEXT, 
  volume TEXT, number TEXT, note TEXT, implementationurl TEXT, paperurl 
  TEXT, tags TEXT);

For your convenience, you'll find an empty sqlite database of this schema.
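
To illustrate the modification-date check, here is a minimal sketch of how a cache could decide whether a .bib file needs re-parsing, using the bibmeta table from the schema above. The helper names bibFileIsStale and rememberBibFile are hypothetical and not part of phpBibLib, and storing the Unix timestamp from filemtime() in fdate is an assumption; the library may store dates differently.

```php
<?php
// Hypothetical helper, not part of phpBibLib: has this file changed since
// it was last recorded in bibmeta (or was it never recorded at all)?
function bibFileIsStale(PDO $db, string $fname): bool
{
    $stmt = $db->prepare('SELECT fdate FROM bibmeta WHERE fname = ?');
    $stmt->execute([$fname]);
    $stored  = $stmt->fetchColumn();          // false if the file is unknown
    $current = (string) filemtime($fname);    // assumption: Unix timestamp as text
    return $stored === false || $stored !== $current;
}

// Hypothetical helper: record the current modification date after a (re)parse.
function rememberBibFile(PDO $db, string $fname): void
{
    $db->prepare('DELETE FROM bibmeta WHERE fname = ?')->execute([$fname]);
    $db->prepare('INSERT INTO bibmeta (fname, fdate) VALUES (?, ?)')
       ->execute([$fname, (string) filemtime($fname)]);
}

// Demo on an in-memory database and a temporary file.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE bibmeta (fname TEXT, fdate TEXT)');

$file = tempnam(sys_get_temp_dir(), 'bib');
file_put_contents($file, "@misc{demo, title = {Demo}}\n");

$first = bibFileIsStale($db, $file);    // never cached, so a parse is needed
rememberBibFile($db, $file);
$second = bibFileIsStale($db, $file);   // unchanged since it was recorded

touch($file, filemtime($file) + 60);    // simulate a later edit
clearstatcache();                       // make sure filemtime() sees the change
$third = bibFileIsStale($db, $file);
unlink($file);
```

Comparing for inequality rather than "newer than" also catches a file that was replaced by an older copy, which matches the "do not match" wording above.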

Once we have the Bibtex object, we can use it in two ways, i.e.,

  • by the standard LaTeX way of sprinkling some text with cite- and citet-functions, or,
  • by querying it directly, setting conditions on bibtex fields,
after which we can (optionally) pretty-print the resulting bibliography. Note that these two ways of selecting bibliographic entries can be combined. That is, under the hood, both methods simply 'activate' the selected/cited references, and upon printing the list, all 'activated' publications are printed.
We support several orderings of the bibliography. Note, however, that if one uses in-text cites and requires something other than usage-based order, a pre-scan of the content containing the cites is necessary.
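
The 'activation' idea can be pictured with a small stand-in class. This is only a sketch of the concept, not the library's code: the class ActivationSketch and its methods are hypothetical, and 'vreeken07' is a made-up key used for illustration.

```php
<?php
// Hypothetical stand-in for the 'activation' idea; not phpBibLib's actual code.
class ActivationSketch
{
    /** @var array<string, bool> activated keys, in order of first activation */
    private $active = [];

    // Both citing and querying would end up here with the keys they matched.
    public function activate(array $keys): void
    {
        foreach ($keys as $key) {
            $this->active[$key] = true;   // re-activating a key changes nothing
        }
    }

    // Everything activated so far is what a print-out would list.
    public function activeKeys(): array
    {
        return array_map('strval', array_keys($this->active));
    }

    // Forget all activations, so that a second bibliography starts empty.
    public function reset(): void
    {
        $this->active = [];
    }
}

$refs = new ActivationSketch();
$refs->activate(['siebes06', 'agrawal93']);    // e.g. keys cited in the text
$refs->activate(['agrawal93', 'vreeken07']);   // e.g. keys matched by a query
$combined = $refs->activeKeys();               // union, each key listed once
$refs->reset();
$afterReset = $refs->activeKeys();             // nothing is active any more
```

Because both routes feed the same set, a reference that is cited in the text and also matched by a query appears only once in the printed list.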

As possible use cases we see

  • easily maintainable, dynamic, bibtex file driven, personal (e.g. here) or research group publication listing websites.
  • neat scientifically cited overviews, such as for tutorials, e.g. on Mining Sets of Patterns
  • all the other, endless, possible uses one can have of having the data in some bibtex files readily available in php.

As I'm particularly bad at writing help-files, I'll just give a live demonstration of the static, non-DB cached, version of the library below. You can find the live demonstration of the DB cached version here. (Note that in the download package you'll find this demo included separately for the two variants.) The library has a (growing) number of undocumented features, so don't be surprised if it can already do what you want, although it is not specified here.

Usage Examples - Reparsing

As described above, we first instantiate a Bibtex object that reads from the 'refs.bib' and 'references.bib' files, and store it in the global $Site variable so that the `cite` and `citet` functions can find it by default.

$Site['bibtex'] = new Bibtex('refs', 'references.bib');
$bib = $Site['bibtex']; // for cite($bib, 'agrawal93'), where 
                        // $bib may be another Bibtex object than above

We can use the library in two ways, i.e., 1) by querying it directly, and printing the resulting list of publications, or, 2) by the standard LaTeX way of sprinkling some text with cite- and citet-functions, and (optionally) printing the list of referenced publications. For printing the list, these two can be combined. That is, under the hood, both methods simply 'activate' the selected/cited references, and upon printing the list, all 'activated' publications are printed.

Let us start by considering some examples of the 2nd method. For these examples, we consider the following text which we will store in a file called example-content.inc.php.

This and that has long been known to be such and so <?cite('vreeken15', 'siebes06', 
'DBLP:journals/tkde/MiettinenMGDM08');?>. Furthermore, <?citet('agrawal93')?> 
clearly did not <?cite($bib, 'agrawal93')?>.

This example includes two calls to the basic cite function and one call to the citet function. The first cite call requests three references, two of which are present in the provided bibtex files and one of which ('vreeken15') is missing. The cite function displays a (list of) citations in the currently selected style, whereas citet prints the names of the authors of the reference and then adds its citation. If the first argument is an object, both functions use it as the Bibtex object; otherwise they fall back to the $Site['bibtex'] object.
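
The fallback rule for the first argument can be sketched as follows. The helper resolveBibAndKeys is hypothetical and only illustrates the rule described above; it is not how the library is known to implement cite and citet, and it takes the $Site array as a parameter instead of reading a global.

```php
<?php
// Hypothetical helper: split the arguments of a cite-like call into the
// Bibtex object to use and the list of citation keys.
function resolveBibAndKeys(array $args, array $site): array
{
    if (count($args) > 0 && is_object($args[0])) {
        $bib = array_shift($args);   // an explicit object was passed first
    } else {
        $bib = $site['bibtex'];      // default: the object stored in $Site
    }
    return [$bib, $args];
}

$default = new stdClass();           // stand-ins for two Bibtex objects
$other   = new stdClass();
$site    = ['bibtex' => $default];

list($bibA, $keysA) = resolveBibAndKeys(['vreeken15', 'siebes06'], $site);
list($bibB, $keysB) = resolveBibAndKeys([$other, 'agrawal93'], $site);
```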

Basic citing, without prescanning. Numeric references, usage-ordered.

In its most basic set-up, we can use the below code

$Site['bibtex']->SetBibliographyStyle('numeric'); // not necessary here, is the default
$Site['bibtex']->SetBibliographyOrder('usage');   // not necessary here, is the default
include 'example-content.inc.php';
$Site['bibtex']->PrintBibliography();

to obtain a numbered bibliography that is sorted, and hence numbered, by the order in which the references are first used in the content, resulting in

This and that has long been known to be such and so [?,1,2]. Furthermore, Agrawal et al. [3] clearly did not [3].

[1] Siebes, A., Vreeken, J. & van Leeuwen, M. Item Sets that Compress. In Proc. SDM'06, pages 393-404, 2006.
[2] Miettinen, P., Mielikäinen, T., Gionis, A., Das, G. & Mannila, H. The Discrete Basis Problem. IEEE Trans. Knowl. Data Eng., 20(10):1348-1362, 2008.
[3] Agrawal, R., Imielinksi, T. & Swami, A. Mining association rules between sets of items in large databases. In Proc. SIGMOD'93, pages 207-216, ACM, 1993.

Basic citing, without prescanning. Abbrv references, usage-ordered.

Alternatively, we can use the 'abbrv' citation style instead of 'numeric' in the above example, i.e., by using

$Site['bibtex']->SetBibliographyStyle('abbrv');

which will make the library spit out citation keys built from the last names of up to the first three authors. For each last name it takes the first character of every part up to and including the first part that starts with a capital (so 'van Leeuwen' becomes 'vL'), adds a '+' if there are more than three authors, and appends the year in two digits. This gives us the following output:

This and that has long been known to be such and so [?,SVvL06,MMG+08]. Furthermore, Agrawal et al. [AIS93] clearly did not [AIS93].

[SVvL06] Siebes, A., Vreeken, J. & van Leeuwen, M. Item Sets that Compress. In Proc. SDM'06, pages 393-404, 2006.
[MMG+08] Miettinen, P., Mielikäinen, T., Gionis, A., Das, G. & Mannila, H. The Discrete Basis Problem. IEEE Trans. Knowl. Data Eng., 20(10):1348-1362, 2008.
[AIS93] Agrawal, R., Imielinksi, T. & Swami, A. Mining association rules between sets of items in large databases. In Proc. SIGMOD'93, pages 207-216, ACM, 1993.
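
The key construction described above can be sketched as a small function. The name abbrvLabel and its signature are hypothetical; the sketch follows the rule as stated in the text and reproduces the three keys shown in the output, but it is not taken from the library's source.

```php
<?php
// Hypothetical helper: build an 'abbrv'-style key from the authors'
// last names (in author order) and the publication year.
function abbrvLabel(array $lastNames, string $year): string
{
    $label = '';
    foreach (array_slice($lastNames, 0, 3) as $lastName) {   // at most three authors
        foreach (preg_split('/\s+/', trim($lastName)) as $part) {
            if (!preg_match('/^./u', $part, $m)) {
                continue;                                    // skip empty parts
            }
            $label .= $m[0];                                 // first character
            if (preg_match('/^\p{Lu}/u', $part)) {
                break;                                       // stop at the first capitalised part
            }
        }
    }
    if (count($lastNames) > 3) {
        $label .= '+';                                       // more authors than shown
    }
    return $label . substr($year, -2);                       // two-digit year
}

$k1 = abbrvLabel(['Siebes', 'Vreeken', 'van Leeuwen'], '2006');
$k2 = abbrvLabel(['Miettinen', 'Mielikäinen', 'Gionis', 'Das', 'Mannila'], '2008');
$k3 = abbrvLabel(['Agrawal', 'Imielinski', 'Swami'], '1993');
```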

Basic citing, without prescanning. Natbib references, usage-ordered

Third, we have the 'natbib' option

$Site['bibtex']->SetBibliographyStyle('natbib');

This option prints the names of the authors if there are at most two; otherwise it prints the last name of the first author followed by 'et al.'. In both cases it also provides the year.

This and that has long been known to be such and so (?; Siebes et al., 2006; Miettinen et al., 2008). Furthermore, Agrawal et al. (1993) clearly did not (Agrawal et al., 1993).

Siebes, A., Vreeken, J. & van Leeuwen, M. (2006) Item Sets that Compress. In Proc. SDM'06, pages 393-404
Miettinen, P., Mielikäinen, T., Gionis, A., Das, G. & Mannila, H. (2008) The Discrete Basis Problem. IEEE Trans. Knowl. Data Eng., 20(10):1348-1362
Agrawal, R., Imielinksi, T. & Swami, A. (1993) Mining association rules between sets of items in large databases. In Proc. SIGMOD'93, pages 207-216, ACM
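
A minimal sketch of the natbib-style labels, following the rule described above. The function names are hypothetical, and the ' & ' joiner for two authors is an assumption, since the output above only shows the 'et al.' case.

```php
<?php
// Hypothetical helper: the author part of a natbib-style label.
function natbibAuthors(array $lastNames): string
{
    if (count($lastNames) <= 2) {
        return implode(' & ', $lastNames);   // assumption: '&' joins two names
    }
    return $lastNames[0] . ' et al.';
}

// Parenthetical form, as printed for cite in the output above.
function natbibCite(array $lastNames, string $year): string
{
    return '(' . natbibAuthors($lastNames) . ', ' . $year . ')';
}

// Textual form, as printed for citet in the output above.
function natbibCitet(array $lastNames, string $year): string
{
    return natbibAuthors($lastNames) . ' (' . $year . ')';
}

$authors = ['Agrawal', 'Imielinski', 'Swami'];
$paren   = natbibCite($authors, '1993');
$textual = natbibCitet($authors, '1993');
$pair    = natbibAuthors(['Vreeken', 'Siebes']);
```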

With prescanning. Numeric references, alphabetic-ordered

In practice, usage-based ordering might not be what we want. Instead, we might want to have the printed bibliography sorted alphabetically, or by year of publication. This means that cite cannot simply hand out numbers on the fly, because the number of a reference depends on all the other references that are cited. Hence we are required to scan the content file before printing the citations. This is analogous to the well-known LaTeX compile .tex - compile .bib - compile .tex loop. In our code, to get this pre-scanning, we replace the simple 'include' with

$Site['bibtex']->IncludeBibContent('example-content.inc.php');

which gives us the freedom to use other orderings: we have the options 'usage', 'alphabetic', 'year_a' and 'year_d' (where a stands for ascending and d for descending). Now, the following code

$Site['bibtex']->SetBibliographyStyle('numeric');
$Site['bibtex']->SetBibliographyOrder('alphabetic');
// the second argument ('$bib') is optional if you use $Site['bibtex']
$Site['bibtex']->IncludeBibContent('example-content.inc.php', '$bib');
$Site['bibtex']->PrintBibliography();

results in

This and that has long been known to be such and so [?,1,2]. Furthermore, Agrawal et al. [3] clearly did not [3].

[1] Siebes, A., Vreeken, J. & van Leeuwen, M. Item Sets that Compress. In Proc. SDM'06, pages 393-404, 2006.
[2] Miettinen, P., Mielikäinen, T., Gionis, A., Das, G. & Mannila, H. The Discrete Basis Problem. IEEE Trans. Knowl. Data Eng., 20(10):1348-1362, 2008.
[3] Agrawal, R., Imielinksi, T. & Swami, A. Mining association rules between sets of items in large databases. In Proc. SIGMOD'93, pages 207-216, ACM, 1993.

With prescanning. Numeric references, year-asc ordered.

And, as an example of using 'year_a' as option for the bibliography order,

$Site['bibtex']->SetBibliographyOrder('year_a');

we get

This and that has long been known to be such and so [?,3,2]. Furthermore, Agrawal et al. [1] clearly did not [1].

[1] Agrawal, R., Imielinksi, T. & Swami, A. Mining association rules between sets of items in large databases. In Proc. SIGMOD'93, pages 207-216, ACM, 1993.
[2] Miettinen, P., Mielikäinen, T., Gionis, A., Das, G. & Mannila, H. The Discrete Basis Problem. IEEE Trans. Knowl. Data Eng., 20(10):1348-1362, 2008.
[3] Siebes, A., Vreeken, J. & van Leeuwen, M. Item Sets that Compress. In Proc. SDM'06, pages 393-404, 2006.
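
Why a pre-scan is needed can be seen from a small sketch of the numbering step: the number of a reference is only known once all cited keys have been collected and sorted. The function numberReferences is hypothetical, and sorting 'alphabetic' by the author field is an assumption; the library may use a different sort key.

```php
<?php
// Hypothetical helper: given the keys in the order they were cited (found
// keys only) and the parsed entries, return key => bibliography number.
function numberReferences(array $usedKeys, array $entries, string $order): array
{
    $keys  = array_values(array_unique($usedKeys));   // first use wins
    $usage = array_flip($keys);                       // key => position of first use

    usort($keys, function ($a, $b) use ($entries, $usage, $order) {
        switch ($order) {
            case 'alphabetic':                        // assumption: by author field
                $cmp = strcasecmp($entries[$a]['author'], $entries[$b]['author']);
                break;
            case 'year_a':
                $cmp = (int) $entries[$a]['year'] <=> (int) $entries[$b]['year'];
                break;
            case 'year_d':
                $cmp = (int) $entries[$b]['year'] <=> (int) $entries[$a]['year'];
                break;
            default:                                  // 'usage'
                $cmp = 0;
        }
        return $cmp !== 0 ? $cmp : $usage[$a] <=> $usage[$b];   // ties: usage order
    });

    $numbers = [];
    foreach ($keys as $i => $key) {
        $numbers[$key] = $i + 1;
    }
    return $numbers;
}

$mm = 'DBLP:journals/tkde/MiettinenMGDM08';
$entries = [
    'siebes06'  => ['author' => 'Siebes, A.',    'year' => '2006'],
    $mm         => ['author' => 'Miettinen, P.', 'year' => '2008'],
    'agrawal93' => ['author' => 'Agrawal, R.',   'year' => '1993'],
];
$used = ['siebes06', $mm, 'agrawal93', 'agrawal93'];   // order of use in the text

$byUsage = numberReferences($used, $entries, 'usage');
$byYear  = numberReferences($used, $entries, 'year_a');
```

With 'year_a' this gives agrawal93 the number 1 and siebes06 the number 3, as in the output above, even though siebes06 is the first key cited in the text.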

Showing results of queries

As a final example (for now) consider the following code

$Site['bibtex']->SetBibliographyStyle('numeric');
$Site['bibtex']->SetBibliographyOrder('year_d');
$Site['bibtex']->Select(array('author' => 'Vreeken'));
$Site['bibtex']->PrintBibliography();

which neatly gives us all publications in the above-mentioned bib-files that have 'Vreeken' in the author field.

[1] Mampaey, M. & Vreeken, J. Summarising Data by Clustering Items. In Proc. ECMLPKDD'10, 2010.
[2] Mampaey, M., Tatti, N. & Vreeken, J. Tell Me What I Need To Know: Succinctly Summarising Data by Itemsets. In Proc. KDD'11, 2011.
[3] Remmerie, N., Vijlder, T.D., Valkenborg, D., Laukens, K., Smets, K., Vreeken, J., Mertens, I., Carpentier, S., Panis, B., Jaeger, G.d., Prinsen, E. & Witters, E. Unraveling tobacco BY-2 protein complexes with BN PAGE/LC-MS/MS and clustering methods. Journal of Proteomics, Elsevier, 2011.
[4] Smets, K. & Vreeken, J. The Odd One Out - Identifying and Characterising Anomalies. In Proc. SDM'11, 2011.
[5] Miettinen, P. & Vreeken, J. Model Order Selection for Boolean Matrix Factorization. In Proc. KDD'11, ACM, 2011.
[6] Tatti, N. & Vreeken, J. Comparing Apples and Oranges: Measuring Differences between Data Mining Results. In Proc. ECMLPKDD'11, 2011.
[7] Vreeken, J. & Siebes, A. Filling in the Blanks -- Krimp Minimisation for Missing Data. In Proc. ICDM'08, pages 1067-1072, 2008.
[8] Heikinheimo, H., Vreeken, J., Siebes, A. & Mannila, H. Low-Entropy Set Selection. In Proc. SDM'09, pages 569-579, 2009.
[9] van Leeuwen, M., Vreeken, J. & Siebes, A. Compression Picks the Item Sets that Matter. In Proc. ECML PKDD'06, pages 585-592, 2006.
[10] van Leeuwen, M., Vreeken, J. & Siebes, A. Identifying the Components. Data Min. Knowl. Discov., 19(2):173-292, Springer Netherlands, 2009.
[11] Siebes, A., Vreeken, J. & van Leeuwen, M. Item Sets that Compress. In Proc. SDM'06, pages 393-404, 2006.
[12] Tatti, N. & Vreeken, J. Finding Good Itemsets by Packing Data. In Proc. ICDM'08, pages 588-597, 2008.
[13] Vreeken, J., van Leeuwen, M. & Siebes, A. Characterising the Difference. In Proc. KDD'07, pages 765-774, 2007.

Note that, alternatively, we could have selected on several fields at once, for example on the year as well as the author, using

$Site['bibtex']->Select(array('author' => 'ke', 'year' => 2011));

which gives us the publications from 2011 by authors whose names include 'ke'.

[1] Mampaey, M., Tatti, N. & Vreeken, J. Tell Me What I Need To Know: Succinctly Summarising Data by Itemsets. In Proc. KDD'11, 2011.
[2] Tatti, N. & Vreeken, J. Comparing Apples and Oranges: Measuring Differences between Data Mining Results. In Proc. ECMLPKDD'11, 2011.
[3] Smets, K. & Vreeken, J. The Odd One Out - Identifying and Characterising Anomalies. In Proc. SDM'11, 2011.
[4] Siebes, A. & Kersten, R. A Structure Function for Transaction Data. In Proc. SDM'11, SIAM, 2011.
[5] Miettinen, P. & Vreeken, J. Model Order Selection for Boolean Matrix Factorization. In Proc. KDD'11, ACM, 2011.
[6] Remmerie, N., Vijlder, T.D., Valkenborg, D., Laukens, K., Smets, K., Vreeken, J., Mertens, I., Carpentier, S., Panis, B., Jaeger, G.d., Prinsen, E. & Witters, E. Unraveling tobacco BY-2 protein complexes with BN PAGE/LC-MS/MS and clustering methods. Journal of Proteomics, Elsevier, 2011.
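
The matching behind such a query can be sketched as a filter over parsed entries. The function selectEntries and the entry keys 'a', 'b' and 'c' are hypothetical. Case-insensitive substring matching is an assumption that is consistent with the output above (where 'ke' also matches 'Kersten'); requiring all conditions to hold and treating the year as a substring match are assumptions as well.

```php
<?php
// Hypothetical matcher: keep the entries whose fields contain every
// requested value, compared case-insensitively as substrings.
function selectEntries(array $entries, array $conditions): array
{
    return array_filter($entries, function (array $entry) use ($conditions) {
        foreach ($conditions as $field => $needle) {
            $value = isset($entry[$field]) ? (string) $entry[$field] : '';
            if (stripos($value, (string) $needle) === false) {
                return false;        // one failed condition rejects the entry
            }
        }
        return true;
    });
}

$entries = [
    'a' => ['author' => 'Siebes, A. and Kersten, R.', 'year' => '2011'],
    'b' => ['author' => 'Tatti, N. and Vreeken, J.', 'year' => '2008'],
    'c' => ['author' => 'Agrawal, R. and Imielinski, T. and Swami, A.', 'year' => '1993'],
];

// Only 'a' has both 'ke' in the author field and 2011 as its year.
$hits = array_keys(selectEntries($entries, ['author' => 'ke', 'year' => 2011]));
```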

More options may already be available, or become available, without being documented here. Check the code.

As a final remark, the usage-list of the current Bibtex object can be reset using

$Site['bibtex']->ResetBibliography();
which allows us to have different bibliography print-outs within the same document. w00t.

That's all for now. Go on, play.