Commit e18dd35

Merge pull request #10 from exelearning/feature/some-improvements
Add modern .elpx parsing and content inspection APIs
2 parents 422dcdd + 973096c commit e18dd35

7 files changed: +1862 -681 lines changed

README.md

File mode changed: 100755 → 100644
Lines changed: 85 additions & 56 deletions
@@ -1,6 +1,6 @@
1-
# eXeLearning .elp Parser for PHP
1+
# eXeLearning `.elp` / `.elpx` Parser for PHP
22

3-
Simple, fast, and extension-free parser for eXeLearning project files
3+
Simple parser for eXeLearning project files.
44

55
<p align="center">
66
<a href="#features">Features</a> |
@@ -16,29 +16,28 @@ Simple, fast, and extension-free parser for eXeLearning project files
1616

1717
## Features
1818

19-
**ELP Parser** provides a simple and intuitive API to parse eXeLearning project files (.elp):
19+
`ELPParser` supports the two eXeLearning project families described in the upstream format docs:
2020

21-
- Parse both version 2 and version 3 .elp files
22-
- Extract text content from XML
23-
- Detect file version
24-
- Extract entire .elp file contents
25-
- Retrieve full metadata tree
26-
- No external extensions required
27-
- Lightweight and easy to use (less than 4 KB footprint library)
28-
- Compatible with PHP 8.0 to PHP 8.5
21+
- Legacy `.elp` projects from eXeLearning 2.x based on `contentv3.xml`
22+
- Modern `.elpx` projects from eXeLearning 3+ based on `content.xml` and ODE 2.0
23+
- Modern `.elp` exports that also use `content.xml`
24+
- Detection of eXeLearning major version when the package exposes it
25+
- Heuristic detection of likely v4-style `.elpx` packages using root `content.dtd`
26+
- Extraction of normalized metadata, strings, pages, idevices and asset references
27+
- Safe archive extraction with ZIP path traversal checks
28+
- JSON serialization support
2929

3030
For more information, visit the [documentation](https://exelearning.github.io/elp-parser/).
3131

3232
## Requirements
3333

3434
- PHP 8.0+
3535
- Composer
36-
- zip extension
36+
- `zip` extension
37+
- `simplexml` extension
3738

3839
## Installation
3940

40-
Install the package via Composer:
41-
4241
```bash
4342
composer require exelearning/elp-parser
4443
```
@@ -51,78 +50,108 @@ composer require exelearning/elp-parser
5150
use Exelearning\ELPParser;
5251

5352
try {
54-
// Parse an .elp file
55-
$parser = ELPParser::fromFile('/path/to/your/project.elp');
56-
57-
// Get the file version
58-
$version = $parser->getVersion(); // Returns 2 or 3
59-
60-
// Get metadata fields
53+
$parser = ELPParser::fromFile('/path/to/project.elpx');
54+
55+
$version = $parser->getVersion();
6156
$title = $parser->getTitle();
6257
$description = $parser->getDescription();
6358
$author = $parser->getAuthor();
6459
$license = $parser->getLicense();
6560
$language = $parser->getLanguage();
6661

67-
// Get all extracted strings
68-
$strings = $parser->getStrings();
69-
70-
// Print extracted strings
71-
foreach ($strings as $string) {
62+
foreach ($parser->getStrings() as $string) {
7263
echo $string . "\n";
7364
}
7465
} catch (Exception $e) {
75-
echo "Error parsing ELP file: " . $e->getMessage();
66+
echo "Error parsing project: " . $e->getMessage();
7667
}
7768
```
7869

79-
### File Extraction
70+
### Format Inspection
8071

8172
```php
8273
use Exelearning\ELPParser;
8374

84-
try {
85-
$parser = ELPParser::fromFile('/path/to/your/project.elp');
86-
87-
// Extract entire .elp contents to a directory
88-
$parser->extract('/path/to/destination/folder');
89-
} catch (Exception $e) {
90-
echo "Error extracting ELP file: " . $e->getMessage();
91-
}
75+
$parser = ELPParser::fromFile('/path/to/project.elpx');
76+
77+
echo $parser->getSourceExtension(); // elp | elpx
78+
echo $parser->getContentFormat(); // legacy-contentv3 | ode-content
79+
echo $parser->getContentFile(); // contentv3.xml | content.xml
80+
echo $parser->getContentSchemaVersion(); // 2.0 for modern ODE packages
81+
echo $parser->getExeVersion(); // raw upstream version string when present
82+
echo $parser->getResourceLayout(); // none | content-resources | legacy-temp-paths | mixed
83+
var_dump($parser->hasRootDtd()); // true when content.dtd exists at archive root
84+
var_dump($parser->isLikelyVersion4Package());
9285
```
9386

94-
### Advanced Usage
87+
### Pages and Assets
9588

9689
```php
97-
// Convert parsed data to array
98-
$data = $parser->toArray();
90+
$pages = $parser->getPages();
91+
$visiblePages = $parser->getVisiblePages();
92+
$blocks = $parser->getBlocks();
93+
$idevices = $parser->getIdevices();
94+
$pageTexts = $parser->getPageTexts();
95+
$visiblePageTexts = $parser->getVisiblePageTexts();
96+
$firstPageText = $parser->getPageTextById($pages[0]['id']);
97+
$teacherOnlyIdevices = $parser->getTeacherOnlyIdevices();
98+
$hiddenIdevices = $parser->getHiddenIdevices();
99+
$assets = $parser->getAssets();
100+
$images = $parser->getImages();
101+
$audioFiles = $parser->getAudioFiles();
102+
$videoFiles = $parser->getVideoFiles();
103+
$documents = $parser->getDocuments();
104+
$assetsDetailed = $parser->getAssetsDetailed();
105+
$orphanAssets = $parser->getOrphanAssets();
106+
$metadata = $parser->getMetadata();
107+
```
108+
109+
In modern `content.xml` packages, assets usually live under paths such as `content/resources/...`.
110+
Older projects and some transitional exports may still reference legacy layouts such as `files/tmp/...`.
111+
The parser exposes this through `getResourceLayout()`.
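
To make the four layout labels concrete, here is a minimal sketch of how a list of asset paths could be classified. The function `classifyResourceLayout()` is a hypothetical helper written for illustration and is not part of `ELPParser`; it assumes the two path prefixes named above are the only signals, which the library itself may not do.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not the library's implementation): map asset paths
 * to one of the labels documented for getResourceLayout().
 *
 * @param string[] $assetPaths Asset paths as referenced inside the project.
 */
function classifyResourceLayout(array $assetPaths): string
{
    $modern = false;
    $legacy = false;

    foreach ($assetPaths as $path) {
        // Normalize Windows separators and drop any leading slash.
        $normalized = ltrim(str_replace('\\', '/', $path), '/');

        if (str_starts_with($normalized, 'content/resources/')) {
            $modern = true;
        } elseif (str_starts_with($normalized, 'files/tmp/')) {
            $legacy = true;
        }
    }

    if ($modern && $legacy) {
        return 'mixed';
    }
    if ($modern) {
        return 'content-resources';
    }
    if ($legacy) {
        return 'legacy-temp-paths';
    }

    return 'none';
}
```

A project whose assets are all under `content/resources/` would be labelled `content-resources` by this sketch, while one that references both prefixes would be labelled `mixed`.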
99112

100-
// JSON serialization
101-
$jsonData = json_encode($parser);
113+
### Export JSON
102114

103-
// Export directly to a JSON file
104-
$parser->exportJson('path/to/output.json');
115+
```php
116+
$json = $parser->exportJson();
117+
$parser->exportJson('/path/to/output.json');
118+
```
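
For readers unfamiliar with how an object can back both `json_encode()` and an `exportJson()`-style method, the following sketch shows one common PHP pattern. `ProjectSummary` is a hypothetical stand-in class, and its field names and optional-path behaviour are assumptions for illustration; the real `ELPParser` output shape and internals may differ.

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-in for a parsed project; not the library's class. */
final class ProjectSummary implements JsonSerializable
{
    /** @param string[] $strings */
    public function __construct(
        private string $title,
        private int $version,
        private array $strings = [],
    ) {
    }

    /** Called automatically by json_encode($this). */
    public function jsonSerialize(): array
    {
        return [
            'title'   => $this->title,
            'version' => $this->version,
            'strings' => $this->strings,
        ];
    }

    /** Return the JSON string and, when a path is given, also write it to disk. */
    public function exportJson(?string $path = null): string
    {
        $json = json_encode(
            $this,
            JSON_THROW_ON_ERROR | JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE
        );

        if ($path !== null && file_put_contents($path, $json) === false) {
            throw new RuntimeException("Unable to write JSON to {$path}");
        }

        return $json;
    }
}
```

Using `JSON_THROW_ON_ERROR` keeps failures consistent with an exception-based API instead of returning `false` silently.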
105119

106-
// Retrieve full metadata as array
107-
$meta = $parser->getMetadata();
120+
### Extract Project Files
121+
122+
```php
123+
$parser->extract('/path/to/destination');
108124
```
109125

110-
## Error Handling
126+
## Version Compatibility
127+
128+
The parser distinguishes between project format and eXeLearning version:
111129

112-
The parser includes robust error handling:
113-
- Detects invalid .elp files
114-
- Throws exceptions for parsing errors
115-
- Supports both version 2 and 3 file formats
130+
- `getContentFormat()` tells you whether the package uses legacy `contentv3.xml` or modern `content.xml`
131+
- `getVersion()` reports the detected eXeLearning major version
132+
- In practice this means:
133+
- eXeLearning 2.x legacy `.elp` => version `2`
134+
- modern ODE-based `.elp` => usually version `3`
135+
- `.elpx` packages with root `content.dtd` are treated as likely v4-style packages and currently report version `4`
136+
- otherwise modern ODE-based packages default to version `3`
116137

117-
## Performance
138+
This distinction matters because some projects created with newer eXeLearning builds still identify themselves internally with `exe_version=3.0`, so strict `v4` detection is not always possible from the package alone.
139+
For that reason, the library combines explicit metadata with format heuristics:
118140

119-
- Lightweight implementation
120-
- Minimal memory footprint
121-
- Fast XML parsing using native PHP extensions
141+
- `.elpx`
142+
- `content.xml`
143+
- root `content.dtd`
144+
- optionally `content/resources/...` as the modern resource layout
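
The rules above can be sketched as a small decision function. `guessMajorVersion()` is a hypothetical illustration, not the library's code: it looks only at the file extension and archive entry names, ignores the `exe_version` metadata that the parser also consults, and assumes `content.xml` takes precedence when both content files are present.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of the documented version rules.
 *
 * @param string   $extension  Source extension without the dot ("elp" or "elpx").
 * @param string[] $entryNames Entry names at the root of the ZIP archive.
 */
function guessMajorVersion(string $extension, array $entryNames): int
{
    $hasOdeContent    = in_array('content.xml', $entryNames, true);
    $hasLegacyContent = in_array('contentv3.xml', $entryNames, true);
    $hasRootDtd       = in_array('content.dtd', $entryNames, true);

    if ($hasOdeContent) {
        // .elpx plus a root content.dtd is treated as a likely v4-style package.
        if (strtolower($extension) === 'elpx' && $hasRootDtd) {
            return 4;
        }

        // Every other modern ODE-based package defaults to version 3.
        return 3;
    }

    if ($hasLegacyContent) {
        // eXeLearning 2.x legacy project.
        return 2;
    }

    throw new InvalidArgumentException(
        'Unsupported project layout: no content.xml or contentv3.xml found.'
    );
}
```

Because the result is a heuristic, callers that need certainty should treat a returned `4` as "likely v4-style" and not as a guarantee.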
145+
146+
## Error Handling
122147

123-
## Contributing
148+
The parser throws exceptions for:
124149

125-
Contributions are welcome! Please submit pull requests or open issues on our GitHub repository.
150+
- Missing files
151+
- Invalid ZIP archives
152+
- Unsupported project layouts
153+
- XML parsing failures
154+
- Unsafe archive entries during extraction
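
As an illustration of what an "unsafe archive entry" can mean, the sketch below flags entry names that could escape the destination directory. `isUnsafeArchiveEntry()` is a hypothetical helper, and the exact checks the library performs are not shown in this README, so treat the rule set as an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical check (not the library's implementation): return true when a
 * ZIP entry name could resolve outside the extraction directory.
 */
function isUnsafeArchiveEntry(string $entryName): bool
{
    // Treat Windows separators like POSIX ones before inspecting segments.
    $name = str_replace('\\', '/', $entryName);

    // Empty names and embedded null bytes are never legitimate.
    if ($name === '' || str_contains($name, "\0")) {
        return true;
    }

    // Absolute POSIX paths and Windows drive-letter paths.
    if (str_starts_with($name, '/') || preg_match('/^[A-Za-z]:/', $name) === 1) {
        return true;
    }

    // Any ".." path segment is rejected, wherever it appears.
    foreach (explode('/', $name) as $segment) {
        if ($segment === '..') {
            return true;
        }
    }

    return false;
}
```

A caller could run this over each name returned by `ZipArchive::getNameIndex()` and throw before extracting anything if one entry fails the check.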
126155

127156
## License
128157
