MARC / PERL: Koha Integrated Library System
MARC Tutorial
VALIDATING
MARC::Lint, available on CPAN and in CVS on SourceForge, lets you validate MARC records. It provides an extensive battery of tests and a framework for adding more.
Using MARC::Lint
Here is an example of using MARC::Lint to generate a list of errors present in a batch of records in a file named 'file.dat':
```perl
## Example V1

use MARC::Batch;
use MARC::Lint;

my $batch = MARC::Batch->new('USMARC','file.dat');
my $linter = MARC::Lint->new();
my $counter = 0;

while (my $record = $batch->next() ) {

    $counter++;

    ## feed the record to our linter object.
    $linter->check_record($record);

    ## get the warnings...
    my @warnings = $linter->warnings();

    ## output any warnings.
    if (@warnings) {

        print "RECORD $counter\n";
        print join("\n",@warnings),"\n";

    }

}
```
MARC::Lint is quite thorough. When validating, it checks for the presence of a 245 field, the repeatability of fields and subfields, the valid use of subfields within particular fields, and the presence of indicators and their values. All checks are based on the MARC21 bibliographic format.
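To see one of these checks fire without needing a data file, the following minimal sketch builds a record in memory that has no 245 field and passes it to the linter. The record contents are made up for illustration, and the exact wording of the warnings depends on the installed version of MARC::Lint.

```perl
use strict;
use warnings;

use MARC::Record;
use MARC::Field;
use MARC::Lint;

## hypothetical record: it has an author (100) but no title (245).
my $record = MARC::Record->new();
$record->append_fields(
    MARC::Field->new('100', '1', ' ', a => 'Wall, Larry.'),
);

my $linter = MARC::Lint->new();
$linter->check_record($record);

## each warning is a plain string describing one problem.
my @warnings = $linter->warnings();
print "$_\n" for @warnings;
```

Because the record lacks a title field, the list of warnings should include one that mentions the 245 tag.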
Customizing MARC::Lint
MARC::Lint makes no claim to check everything that might be wrong with a MARC record. In practice, individual libraries may have their own ideas about what is valid or invalid. For example, a library may mandate that every MARC record with an 856 field have a subfield z that reads "Connect to this resource". MARC::Lint provides a framework for adding such rules through the object-oriented programming technique of inheritance. In short, you create your own subclass of MARC::Lint and then use it to validate your records. Here's an example:
```perl
## Example V2

## first, create our own subclass of MARC::Lint.
## should be saved in a file called MyLint.pm.

package MyLint;
use base qw(MARC::Lint);

## add a method to check that the 856
## fields contain a correct subfield z.
sub check_856 {

    ## your method is passed the lint object and
    ## the MARC::Field object for the 856 field.
    my ($self,$field) = @_;

    ## subfield() returns undef when there is no subfield z,
    ## so test for that before comparing strings.
    my $z = $field->subfield('z');

    if (!defined $z or $z ne 'Connect to this resource') {

        ## add a warning to our lint object.
        $self->warn("856 subfield z must read 'Connect to this resource'.");

    }

}

## a module file must end with a true value.
1;
```
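A minimal sketch of how such a subclass might be used follows. To keep it runnable as a single file, the package is repeated inline; in practice it would live in MyLint.pm and be loaded with `use MyLint;`. The test record, its URL, and its subfield z text are hypothetical, and this inline copy also treats a missing subfield z as an error.

```perl
use strict;
use warnings;

use MARC::Record;
use MARC::Field;

## inline copy of the subclass so that this sketch is self-contained.
{
    package MyLint;
    use base qw(MARC::Lint);

    sub check_856 {
        my ($self, $field) = @_;
        my $z = $field->subfield('z');
        if (!defined $z or $z ne 'Connect to this resource') {
            $self->warn("856 subfield z must read 'Connect to this resource'.");
        }
    }
}

## hypothetical record whose 856 field has the wrong subfield z text.
my $record = MARC::Record->new();
$record->append_fields(
    MARC::Field->new('245', '0', '0', a => 'Example title.'),
    MARC::Field->new('856', '4', '0',
        u => 'http://example.org/',
        z => 'Click here',
    ),
);

## the subclass is used exactly like MARC::Lint itself.
my $linter = MyLint->new();
$linter->check_record($record);

my @warnings = $linter->warnings();
print "$_\n" for @warnings;
```

The custom check runs alongside the built-in MARC::Lint checks, so the local rule's warning is reported together with any standard warnings for the record.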