SoftCircuits.CsvParser
4.0.0
Prefix Reserved
See the version list below for details.
dotnet add package SoftCircuits.CsvParser --version 4.0.0
NuGet\Install-Package SoftCircuits.CsvParser -Version 4.0.0
<PackageReference Include="SoftCircuits.CsvParser" Version="4.0.0" />
paket add SoftCircuits.CsvParser --version 4.0.0
#r "nuget: SoftCircuits.CsvParser, 4.0.0"
// Install SoftCircuits.CsvParser as a Cake Addin #addin nuget:?package=SoftCircuits.CsvParser&version=4.0.0 // Install SoftCircuits.CsvParser as a Cake Tool #tool nuget:?package=SoftCircuits.CsvParser&version=4.0.0
CSV Parser
Install-Package SoftCircuits.CsvParser
Overview
CsvParser is a .NET library that makes it easy to work with comma-separated-value (CSV) files. (And can be customized to work with other delimiters). CsvParser includes basic classes to read and write CSV data, and also higher-level classes that automatically map class properties to CSV columns. The library correctly handles column values that contain embedded commas, quotes or other special characters. It even supports column values that span multiple lines. CsvParser is very fast and is designed to handle large data files efficiently. The object mapping classes run up to four times faster than the popular CsvHelper library.
CsvWriter and CsvReader Classes
These classes provide the simplest way to read and write CSV files. The example below writes several records of data to a CSV file.
// Write some data to disk
using CsvWriter writer = new(path);
// Write header
writer.Write("Name", "Email", "Phone", "Birthday");
// Write data
writer.Write("Bill Smith", "bsmith@domain.com", "555-1234", "10/29/1982 12:00:00 AM");
writer.Write("Susan Carpenter", "scarpenter@domain.com", "555-2345", "2/17/1995 12:00:00 AM");
writer.Write("Jim Windsor", "jwindsor@domain.com", "555-3456", "4/23/1989 12:00:00 AM");
writer.Write("Jill Morrison", "jmorrison@domain.com", "555-4567", "5/2/1979 12:00:00 AM");
writer.Write("Gary Wright", "gwright@domain.com", "555-5678", "2/18/1984 12:00:00 AM");
Note that the CsvWriter.Write()
method accepts any number of string parameters. It is also overloaded to handle string[]
and IEnumerable<string>
arguments. (Note: The legacy WriteRow()
and WriteRowAsync()
methods can be used to accomplish the same task but Write()
and WriteAsync()
are preferred.)
The next example reads all records from a CSV file.
// Read the data from disk
using CsvReader reader = new(path);
while (reader.Read())
Console.WriteLine(string.Join(", ", reader.Columns));
When the CsvReader.Read()
method returns true, the Columns
property holds the values that were read. This method returns false when the end of the file has been reached. (Note: The legacy ReadRow()
and ReadRowAsync()
methods can be used to accomplish the same task but Read()
and ReadAsync()
are preferred and have slightly better performance.)
CsvWriter <T> and CsvReader<T> Classes
These are higher level classes that will automatically map data between class properties and CSV columns. The following example defines a class, declares a collection with several instances of that class, then uses CsvWriter<T>
to write the data to a CSV file, and CsvReader<T>
to read it back again.
// This class will represent the data in the CSV file
class Person
{
public string Name { get; set; }
public string Email { get; set; }
public string Phone { get; set; }
public DateTime Birthday { get; set; }
}
// Define some sample data
List<Person> People = new()
{
new Person { Name = "Bill Smith", Email = "bsmith@domain.com", Phone = "555-1234", Birthday = new DateTime(1982, 10, 29) },
new Person { Name = "Susan Carpenter", Email = "scarpenter@domain.com", Phone = "555-2345", Birthday = new DateTime(1995, 2, 17) },
new Person { Name = "Jim Windsor", Email = "jwindsor@domain.com", Phone = "555-3456", Birthday = new DateTime(1989, 4, 23) },
new Person { Name = "Jill Morrison", Email = "jmorrison@domain.com", Phone = "555-4567", Birthday = new DateTime(1979, 5, 2) },
new Person { Name = "Gary Wright", Email = "gwright@domain.com", Phone = "555-5678", Birthday = new DateTime(1984, 2, 18) },
};
// Write the data to disk
// Note: Since all records are already in memory, you could replace the
// foreach loop with: writer.Write(People)
using (CsvWriter<Person> writer = new(path))
{
writer.WriteHeaders();
foreach (Person person in People)
writer.Write(person);
}
// Read the data from disk
List<Person> people = new();
using (CsvReader<Person> reader = new(path))
{
// Read header and use to determine column order
reader.ReadHeaders(true);
// Read data
Person? person;
while ((person = reader.Read()) != null)
people.Add(person);
}
It is important to note in the above example where the code that writes the data calls CsvWriter<T>.WriteHeaders()
. This writes a row with the name of each column. (The library gets the column names from the properties of the Person
class.) The code that reads the data calls CsvReader<T>.ReadHeaders()
to read that header data. Because the argument to CsvReader<T>.ReadHeaders()
is true
, this tells the code to use the header data to determine how to map the columns. For example, it can determine the column order and also detect if one or more properties are not mapped to any column.
If you can be sure the CSV file being read was created using the code above, the argument to CsvReader<T>.ReadHeaders()
could be false because you could be confident that the columns would be in the order expected, etc. But if someone else is supplying the CSV file, setting the CsvReader<T>.ReadHeaders()
argument to true
would allow it to work if the supplier put the columns in a different order.
Correctly mapping the class properties to the CSV columns is critical for these classes to work correctly. Here, the code maps the class properties to columns based on the column headers. The following sections will discuss other ways to map class properties to columns.
ColumnMap Attribute
The ColumnMapAttribute
can be applied to any class property or field to specify how that property or field is mapped to a CSV column. This attribute accepts any of the following arguments:
Name: Specifies a column name, allowing the column name to be different from the class property name.
Index: Specifies a property's 0-based column position. To ensure expected results, it is generally best to set the column Index for all properties when setting this property.
Exclude: Specifies whether or not the class property should be excluded, and not written to or read from any column.
ConverterType: Data converters convert individual class properties to strings and back again from strings to class properties. The CsvParser library includes converters for all basic data types (including
Guid
andDateTime
), basic nullable data types, basic data type arrays and basic nullable data type arrays. But you can override the data converter used for any class property. For example, you might want to write your own data converter to support custom property types, or when you are working with data not formatted as expected by the built-in data converters. A good example of this areDateTime
properties because there are so many ways to format date and time values.To override a data converter, create a class that implements the
IDataConverter
interface. The easiest way to do this in a type-safe manner is to derive your class fromDataConverter<T>
, whereT
is the type of the property you are converting. TheDataConverter<T>
class has two abstract methods,ConvertToString()
andTryConvertFromString()
, which must be overridden in your derived class.Finally, set the
ConverterType
argument of the property'sColumnMap
attribute to your custom convert class type. Note that if you set this property to a type that does not implementIDataConverter
, anArgumentOutOfRangeException
exception is thrown at runtime.
The example below uses the ColumnMap
attribute to customize the Person
class. It sets the Index
properties such that the CSV columns appear in the reverse order from how the properties are declared in the class, it excludes the Phone
property, and it causes the Birthday
header to use the name DOB. It also specifies a custom converter for the Birthday
property that stores the date (no time) in a very compact format.
// Add column mapping attributes to our data class
class Person
{
[ColumnMap(Index = 2)]
public string Name { get; set; }
[ColumnMap(Index = 1)]
public string Email { get; set; }
[ColumnMap(Exclude = true)]
public string Phone { get; set; }
[ColumnMap(Index = 0, Name = "DOB", ConverterType = typeof(DateTimeConverter))]
public DateTime Birthday { get; set; }
}
// Create a custom data converter for DateTime values
// Stores a date-only value (no time) in a very compact format
class DateTimeConverter : DataConverter<DateTime>
{
public override string ConvertToString(DateTime value)
{
int i = ((value.Day - 1) & 0x1f) |
(((value.Month - 1) & 0x0f) << 5) |
(value.Year) << 9;
return i.ToString("x");
}
public override bool TryConvertFromString(string s, out DateTime value)
{
try
{
int i = Convert.ToInt32(s, 16);
value = new DateTime(i >> 9, ((i >> 5) & 0x0f) + 1, (i & 0x1f) + 1);
return true;
}
catch (Exception)
{
value = DateTime.Now;
return false;
}
}
}
using (CsvWriter<Person> writer = new(path))
{
writer.WriteHeaders();
foreach (Person person in People)
writer.Write(person);
}
// Read the data from disk
List<Person> people = new();
using (CsvReader<Person> reader = new(path))
{
// Read header and use to determine column order
reader.ReadHeaders(false);
// Read data
Person? person;
while ((person = reader.Read()) != null)
people.Add(person);
}
Note that the code above will work correctly without the calls to the CsvWriter<T>.WriteHeaders()
and CsvReader<T>.ReadHeaders()
methods. We still included them for the benefit of anyone looking at the contents of the CSV file. But they are optional here because we used the ColumnMap
attribute to provide enough information about column order, etc. You can also see that we passed false
to the CsvReader<T>.ReadHeaders()
method. Again, this is because we have all the column information we need. However, if the file is being supplied by someone else, as described earlier, you might want to change this argument to true
, so that it can handle unexpected cases such as where columns are in a different order or are omitted. (If true
is passed to CsvReader<T>.ReadHeaders()
here, it would override any existing Index
and Exclude
mapping properties.)
Also note that, in the collection read back from disk, the Phone
property will always be null
because we excluded that property and it was not written to the CSV file.
MapColumns() Method
In some cases, you may want to set ColumnMap
attributes for a class you cannot directly modify. For example, the class might be part of a library you are using and you don't have the source code. In these cases, you can use the CsvWriter<T>.MapColumns()
and CsvReader<T>.MapColumns()
methods.
The example below creates a custom class that derives from ColumnMaps<T>
, where T
is the type of class being written or read. The constructor of this class must call MapColumn()
for each class property that it maps. This method supports a fluent interface to set the various mapping properties for each class property.
The code that writes the CSV file calls the CsvWriter<T>.MapColumns<T>()
method to register the custom mappings before any data is written. The code that reads the CSV file calls the CsvReader<T>.MapColumns<T>()
method in the same way. Both must use the same mapping in order for the data to be interpreted correctly. The easiest way to do this is to pass the same class to both methods.
// Create our custom mapping class
class PersonMaps : ColumnMaps<Person>
{
public PersonMaps()
{
// Note that only those properties set, and only those columns referenced
// will be modified. All columns and settings not referenced here retain
// their previous values.
MapColumn(m => m.Name).Index(2);
MapColumn(m => m.Email).Index(1);
MapColumn(m => m.Phone).Exclude(true);
MapColumn(m => m.Birthday).Index(0).Name("DOB").Converter(new DateTimeConverter());
}
}
// Write data to disk
using (CsvWriter<Person> writer = new(path))
{
// Register our custom mapping
writer.MapColumns<PersonMaps>();
writer.WriteHeaders();
foreach (Person person in People)
writer.Write(person);
}
// Read data from disk
List<Person> people = new();
using (CsvReader<Person> reader = new(path))
{
// Register our custom mapping
reader.MapColumns<PersonMaps>();
// Read header
reader.ReadHeaders(false);
// Read data
Person? person;
while ((person = reader.Read()) != null)
people.Add(person);
}
This example does exactly the same thing as the previous example but without modifying the Person
class.
CsvSettings Class
You can customize the way the library behaves by passing your own instance of the CsvSettings
class to any of the reader or writer constructors, as demonstrated in the following example. This code uses the CsvSettings
class to read a tab-separated-values (TSV) file. It sets the ColumnDelimiter
property to a tab. It also sets it to use single quotes instead of double quotes (something not useful very often, but is fully supported).
// Set custom settings
CsvSettings settings = new()
{
ColumnDelimiter = '\t',
QuoteCharacter = '\''
};
// Apply custom settings to CsvReader
using CsvReader reader = new(path, settings);
while (reader.Read())
Console.WriteLine(string.Join(", ", reader.Columns));
Additional Information
This code was originally derived from the article Reading and Writing CSV Files in C#.
Product | Versions Compatible and additional computed target framework versions. |
---|---|
.NET | net5.0 is compatible. net5.0-windows was computed. net6.0 is compatible. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 is compatible. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. |
.NET Core | netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
.NET Standard | netstandard2.0 is compatible. netstandard2.1 was computed. |
.NET Framework | net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed. |
MonoAndroid | monoandroid was computed. |
MonoMac | monomac was computed. |
MonoTouch | monotouch was computed. |
Tizen | tizen40 was computed. tizen60 was computed. |
Xamarin.iOS | xamarinios was computed. |
Xamarin.Mac | xamarinmac was computed. |
Xamarin.TVOS | xamarintvos was computed. |
Xamarin.WatchOS | xamarinwatchos was computed. |
-
.NETStandard 2.0
- No dependencies.
-
net5.0
- No dependencies.
-
net6.0
- No dependencies.
-
net7.0
- No dependencies.
NuGet packages (1)
Showing the top 1 NuGet packages that depend on SoftCircuits.CsvParser:
Package | Downloads |
---|---|
SoftCircuits.WebScraper
.NET library to scrape content from the Internet. Use it to extract information from Web pages in your own application. Extracted data is written to a CSV file. Supports paging and can cycle through all combinations of any number of replacement tags. Now targets .NET Standard 2.0 or .NET 5.0, and supports nullable reference types. |
GitHub repositories (1)
Showing the top 1 popular GitHub repositories that depend on SoftCircuits.CsvParser:
Repository | Stars |
---|---|
leandromoh/RecordParser
Zero Allocation Writer/Reader Parser for .NET Core
|
Version | Downloads | Last updated |
---|---|---|
4.5.1 | 2,761 | 3/13/2024 |
4.5.0 | 128 | 3/13/2024 |
4.1.0 | 6,310 | 3/31/2023 |
4.0.0 | 1,046 | 3/15/2023 |
3.5.1 | 10,714 | 12/22/2021 |
3.5.0 | 369 | 12/12/2021 |
3.0.1 | 720 | 10/1/2021 |
3.0.0 | 9,116 | 2/19/2021 |
2.4.3 | 6,665 | 1/14/2021 |
2.4.2 | 787 | 11/27/2020 |
2.4.1 | 486 | 10/15/2020 |
2.4.0 | 493 | 9/19/2020 |
2.3.0 | 544 | 9/13/2020 |
2.2.0 | 2,302 | 6/15/2020 |
2.1.8 | 459 | 6/15/2020 |
2.1.7 | 529 | 5/31/2020 |
2.1.6 | 495 | 5/29/2020 |
2.1.5 | 481 | 5/29/2020 |
2.1.4 | 456 | 5/29/2020 |
2.1.3 | 485 | 5/26/2020 |
2.1.2 | 1,074 | 4/19/2020 |
2.1.1 | 454 | 4/15/2020 |
2.1.0 | 488 | 3/13/2020 |
2.0.11 | 556 | 2/15/2020 |
2.0.10 | 458 | 2/14/2020 |
2.0.9 | 565 | 2/11/2020 |
2.0.8 | 506 | 1/20/2020 |
2.0.7 | 542 | 8/2/2019 |
2.0.6 | 506 | 8/2/2019 |
2.0.5 | 559 | 8/1/2019 |
2.0.4 | 519 | 7/31/2019 |
2.0.3 | 514 | 7/31/2019 |
2.0.2 | 524 | 7/30/2019 |
2.0.1 | 522 | 7/29/2019 |
2.0.0 | 517 | 7/28/2019 |
1.1.0 | 593 | 7/10/2019 |
1.0.0 | 1,557 | 7/9/2019 |
The low-level parser has been meticulously rewritten to be as fast as possible. The new version runs more than twice as fast. Added new CsvReader.Read() and CsvReader.ReadAsync() methods, which simplify loop syntax. Also added CsvWriter.Write() and CsvWriter.WriteAsync(). Added direct support for .NET 7.0. Miscellaneous code and documentation improvements.