How to Work with XML Processing in .NET
XLSX files store their data in XML parts inside an OPC ZIP package. Aspose.Cells FOSS for .NET processes these through four XML mapper classes: WorkbookXmlMapper, WorksheetXmlMapper, SharedStringTableXmlMapper, and StylesheetXmlMapper. Understanding these classes helps you diagnose parsing failures and correctly configure fault-tolerant loading. Install with dotnet add package Aspose.Cells_FOSS.
Step-by-Step Guide
Step 1: Install the Package
    dotnet add package Aspose.Cells_FOSS

Step 2: Import the Namespace
    using Aspose.Cells_FOSS;

Step 3: Understand the XML Mapper Responsibilities
Each mapper handles one XML part of the XLSX structure:
| Mapper | XML Part | Handles |
|---|---|---|
| WorkbookXmlMapper | xl/workbook.xml | Workbook metadata, sheet list, defined names |
| WorksheetXmlMapper | xl/worksheets/sheetN.xml | Cell data, formulas, hyperlinks, validations, conditional formats |
| SharedStringTableXmlMapper | xl/sharedStrings.xml | De-duplicated string values |
| StylesheetXmlMapper | xl/styles.xml | Cell styles, fonts, fills, borders |
These mappers are invoked automatically during Workbook construction and Save(). You do not instantiate them directly in application code.
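To see the parts listed in the table for yourself, you can open a package with the standard System.IO.Compression classes. The following is a minimal sketch that uses only the .NET base class library, not Aspose.Cells_FOSS; ListXmlParts is a hypothetical helper name, and the in-memory package is a stand-in so the example runs without a real file.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;

// Hypothetical helper (not part of Aspose.Cells_FOSS): returns the names of
// all .xml entries in an OPC/ZIP package, sorted by name.
static string[] ListXmlParts(Stream package)
{
    using var archive = new ZipArchive(package, ZipArchiveMode.Read, leaveOpen: true);
    return archive.Entries
        .Select(e => e.FullName)
        .Where(n => n.EndsWith(".xml", StringComparison.OrdinalIgnoreCase))
        .OrderBy(n => n, StringComparer.Ordinal)
        .ToArray();
}

// Build a tiny stand-in package in memory with the four part names from the table.
var buffer = new MemoryStream();
using (var zip = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
{
    var names = new[]
    {
        "xl/workbook.xml",
        "xl/worksheets/sheet1.xml",
        "xl/sharedStrings.xml",
        "xl/styles.xml",
    };
    foreach (var name in names)
    {
        // Placeholder content; a real part holds SpreadsheetML.
        using var writer = new StreamWriter(zip.CreateEntry(name).Open());
        writer.Write("<root/>");
    }
}

buffer.Position = 0;
foreach (var part in ListXmlParts(buffer))
    Console.WriteLine(part);
```

For a real workbook, pass File.OpenRead("file.xlsx") instead of the in-memory buffer. The listing will then also include package-level parts such as [Content_Types].xml.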
Step 4: Handle XmlParsingException
XmlParsingException is thrown when a mapper encounters malformed XML that it cannot repair. Set TryRepairXml = true in LoadOptions to activate the mapper’s fault-tolerant parsing path; the exception is then thrown only for errors that remain unrecoverable.
    using System;
    using Aspose.Cells_FOSS;

    // Enable both package-level and XML-level repair.
    var opts = new LoadOptions
    {
        TryRepairPackage = true,
        TryRepairXml = true,
    };

    try
    {
        var wb = new Workbook("malformed.xlsx", opts);
        Console.WriteLine("Loaded: " + wb.Worksheets.Count + " sheet(s)");

        // Report whether the loader had to repair anything.
        var diag = wb.LoadDiagnostics;
        if (diag.HasRepairs)
            Console.WriteLine("XML repairs applied. Data loss risk: " + diag.HasDataLossRisk);
    }
    catch (XmlParsingException ex)
    {
        // The XML could not be repaired.
        Console.WriteLine("Unrecoverable XML error: " + ex.Message);
    }
    catch (WorkbookLoadException ex)
    {
        // Other load failures reported by the library.
        Console.WriteLine("Load failed: " + ex.Message);
    }

Step 5: Use LoadDiagnostics to Identify XML Issues
After a successful load, iterate over the DiagnosticEntry records in LoadDiagnostics.Issues to see which XML repairs were applied and which of them carry a risk of data loss.
    using System;
    using Aspose.Cells_FOSS;

    // Load with the fault-tolerant XML parsing path enabled.
    var opts = new LoadOptions { TryRepairXml = true };
    var wb = new Workbook("file.xlsx", opts);

    // Print every issue recorded during the load.
    var diag = wb.LoadDiagnostics;
    foreach (var entry in diag.Issues)
    {
        Console.WriteLine($"[{entry.Severity}] {entry.Code}");
        Console.WriteLine($" Message: {entry.Message}");
        Console.WriteLine($" RepairApplied: {entry.RepairApplied} DataLossRisk: {entry.DataLossRisk}");
    }

Common Issues and Fixes
XmlParsingException even with TryRepairXml = true.
The XML is so malformed that the fault-tolerant parser cannot recover it. This can happen with files created by non-standard tools that produce syntactically invalid XML. There is no recovery path for these files.
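Although such a file cannot be loaded, it can help to know which part is malformed before reporting the problem to whoever produced the file. The sketch below is one possible approach using only System.IO.Compression and System.Xml from the .NET base class library; FindMalformedParts is a hypothetical helper, not an Aspose.Cells_FOSS API, and it only checks well-formedness, not schema validity.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Xml;

// Hypothetical helper: reads every .xml entry to the end and records the
// entries that are not well-formed XML.
static List<string> FindMalformedParts(Stream package)
{
    var failures = new List<string>();
    using var archive = new ZipArchive(package, ZipArchiveMode.Read, leaveOpen: true);
    foreach (var entry in archive.Entries)
    {
        if (!entry.FullName.EndsWith(".xml", StringComparison.OrdinalIgnoreCase))
            continue;
        try
        {
            using var stream = entry.Open();
            using var reader = XmlReader.Create(stream);
            while (reader.Read()) { } // walk the whole document
        }
        catch (XmlException ex)
        {
            failures.Add($"{entry.FullName}: line {ex.LineNumber}, position {ex.LinePosition}: {ex.Message}");
        }
    }
    return failures;
}

// Stand-in package: one well-formed part and one deliberately truncated part.
var buffer = new MemoryStream();
using (var zip = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
{
    var samples = new[]
    {
        ("xl/workbook.xml", "<workbook><sheets/></workbook>"),
        ("xl/styles.xml", "<styleSheet><fonts>"), // truncated on purpose
    };
    foreach (var (name, xml) in samples)
    {
        using var writer = new StreamWriter(zip.CreateEntry(name).Open());
        writer.Write(xml);
    }
}

buffer.Position = 0;
var failures = FindMalformedParts(buffer);
foreach (var failure in failures)
    Console.WriteLine(failure);
```

In this sketch only xl/styles.xml is reported. A damaged ZIP container, as opposed to damaged XML, would surface as a different exception (for example InvalidDataException) and is not handled here.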
Styles are missing after load.
The StylesheetXmlMapper may have encountered a corrupt xl/styles.xml. Check LoadDiagnostics.Issues for style-related entries; an entry with DataLossRisk = true indicates that styling for the affected cells may not have been recovered.
Shared strings appear as empty cells.
A corrupt xl/sharedStrings.xml can cause cells that reference the shared string table to render as empty. Enable TryRepairXml to attempt recovery.
Frequently Asked Questions
Can I implement a custom XML mapper?
No. The XML mapper classes are sealed internal infrastructure and are not designed for extension.
Why is the SharedStringTableXmlMapper separate?
The OOXML specification separates repeated string values into a shared string table to reduce file size. The mapper handles reading and writing this table independently from cell data.
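To illustrate the indirection, here is a minimal sketch of how a cell with t="s" points into the shared string table by index. It uses System.Xml.Linq on two small inline documents and is not how Aspose.Cells_FOSS implements its mappers; ResolveCell is a hypothetical helper, and the sample XML is reduced to the elements needed for the example.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

XNamespace ns = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

// Reduced xl/sharedStrings.xml: two unique strings, referenced three times.
var sharedStrings = XDocument.Parse(
@"<sst xmlns=""http://schemas.openxmlformats.org/spreadsheetml/2006/main"" count=""3"" uniqueCount=""2"">
  <si><t>Apple</t></si>
  <si><t>Banana</t></si>
</sst>");

// Reduced worksheet part: A1, B1, C1 hold shared-string indexes, D1 a number.
var sheet = XDocument.Parse(
@"<worksheet xmlns=""http://schemas.openxmlformats.org/spreadsheetml/2006/main"">
  <sheetData>
    <row r=""1"">
      <c r=""A1"" t=""s""><v>0</v></c>
      <c r=""B1"" t=""s""><v>1</v></c>
      <c r=""C1"" t=""s""><v>0</v></c>
      <c r=""D1""><v>42</v></c>
    </row>
  </sheetData>
</worksheet>");

// Build the lookup table. Simplification: concatenates every <t> under an
// <si>, which covers plain and rich-text items but ignores phonetic runs.
var table = sharedStrings.Descendants(ns + "si")
    .Select(si => string.Concat(si.Descendants(ns + "t").Select(t => t.Value)))
    .ToArray();

// Hypothetical helper: returns the shared string for t="s" cells and the raw
// <v> text otherwise.
static string ResolveCell(XElement cell, string[] strings, XNamespace main)
{
    var raw = cell.Element(main + "v")?.Value ?? "";
    return cell.Attribute("t")?.Value == "s"
        ? strings[int.Parse(raw, CultureInfo.InvariantCulture)]
        : raw;
}

var resolved = sheet.Descendants(ns + "c")
    .ToDictionary(c => c.Attribute("r")?.Value ?? "", c => ResolveCell(c, table, ns));

foreach (var pair in resolved)
    Console.WriteLine($"{pair.Key} = {pair.Value}");
```

A1 and C1 both store index 0, so "Apple" appears once in the package no matter how many cells use it. This is also why a damaged shared string table affects every cell that references it.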
Does TryRepairXml fix all XML parsing issues?
TryRepairXml handles recoverable errors such as unclosed elements, missing namespaces, and truncated attribute values. Structurally valid but semantically inconsistent XML (e.g. formula tokens referencing non-existent cells) will still parse without error.
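The difference between the two kinds of problem can be shown with a plain XML parser. The sketch below uses System.Xml.XmlReader from the .NET base class library, not the Aspose.Cells_FOSS mappers; IsWellFormed is a hypothetical helper, and Sheet9 is assumed to be a sheet that does not exist in the workbook.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper: true if the text parses as well-formed XML.
static bool IsWellFormed(string xml)
{
    try
    {
        using var reader = XmlReader.Create(new StringReader(xml));
        while (reader.Read()) { } // walk the whole document
        return true;
    }
    catch (XmlException)
    {
        return false;
    }
}

// Syntax error: the <c> element is never closed.
var unclosed = "<row><c r=\"A1\"><v>1</v></row>";

// Syntactically fine, but the formula points at a sheet assumed not to exist.
var inconsistent = "<row><c r=\"A1\"><f>Sheet9!Z99</f></c></row>";

Console.WriteLine("unclosed well-formed: " + IsWellFormed(unclosed));
Console.WriteLine("inconsistent well-formed: " + IsWellFormed(inconsistent));
```

The first fragment fails at the parser level, which is the class of error a repair option can target. The second passes any well-formedness check, so it has to be caught by application-level validation instead.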