Tuesday, 29 June 2021


How to read a Parquet file in Java?

by deepak on June 29, 2021

If you are trying to read a Parquet file in Java, you can do it by adding these two dependencies to your pom.xml file.

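<dependencies>
    <dependency>
        <groupId>org.apache.parquet</groupId>
        <artifactId>parquet-hadoop</artifactId>
        <!-- add the <version> that matches your environment -->
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <!-- add the <version> that matches your environment -->
    </dependency>
</dependencies>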
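Once the dependencies are declared, one quick way to confirm that they were resolved is to check that a class from each artifact can be loaded. The following is only a minimal sketch: the class name `DependencyCheck` and the helper `isOnClasspath` are hypothetical names chosen for illustration, and it needs to be run with the same classpath as your project.

```java
public class DependencyCheck {

    /** Returns true if the named class can be located on the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize = false: only locate the class, do not run its static initializers
            Class.forName(className, false, DependencyCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] required = {
            "org.apache.parquet.hadoop.ParquetFileReader", // shipped in parquet-hadoop
            "org.apache.hadoop.conf.Configuration"         // shipped in hadoop-common
        };
        for (String name : required) {
            System.out.println(name + (isOnClasspath(name) ? " found" : " MISSING"));
        }
    }
}
```

If either line reports MISSING, the corresponding dependency is probably absent from the pom file or was not downloaded.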

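import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.column.page.PageReadStore;
import org.apache.parquet.example.data.Group;
import org.apache.parquet.example.data.simple.convert.GroupRecordConverter;
import org.apache.parquet.format.converter.ParquetMetadataConverter;
import org.apache.parquet.hadoop.ParquetFileReader;
import org.apache.parquet.hadoop.metadata.ParquetMetadata;
import org.apache.parquet.io.ColumnIOFactory;
import org.apache.parquet.io.MessageColumnIO;
import org.apache.parquet.io.RecordReader;
import org.apache.parquet.schema.MessageType;
import org.apache.parquet.schema.Type;

public class readParaquetFile {

    // Location of the Parquet file to read; change this to point at your own file.
    private static Path path = new Path("C:\\Users\\deepak.mathpal\\Downloads\\userdata1.parquet");

    // Prints every primitive value of one record as "<field name> <value>", then a blank line.
    private static void printGroup(Group g) {
        int fieldCount = g.getType().getFieldCount();
        for (int field = 0; field < fieldCount; field++) {
            // A field may hold zero values (optional), one value, or several (repeated).
            int valueCount = g.getFieldRepetitionCount(field);
            Type fieldType = g.getType().getType(field);
            String fieldName = fieldType.getName();
            for (int index = 0; index < valueCount; index++) {
                // Nested groups are skipped; only primitive fields are printed.
                if (fieldType.isPrimitive()) {
                    System.out.println(fieldName + " " + g.getValueToString(field, index));
                }
            }
        }
        System.out.println("");
    }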

    public static void main(String[] args) throws IllegalArgumentException {
        Configuration conf = new Configuration();
        try {
            // The footer holds the file metadata, including the schema.
            ParquetMetadata readFooter = ParquetFileReader.readFooter(conf, path, ParquetMetadataConverter.NO_FILTER);
            MessageType schema = readFooter.getFileMetaData().getSchema();
            ParquetFileReader r = new ParquetFileReader(conf, path, readFooter);
            PageReadStore pages = null;
            try {
                // readNextRowGroup() returns null once every row group has been read.
                while (null != (pages = r.readNextRowGroup())) {
                    final long rows = pages.getRowCount();
                    System.out.println("Number of rows: " + rows);
                    final MessageColumnIO columnIO = new ColumnIOFactory().getColumnIO(schema);
                    final RecordReader<Group> recordReader = columnIO.getRecordReader(pages, new GroupRecordConverter(schema));
                    for (int i = 0; i < rows; i++) {
                        final Group g = recordReader.read();
                        printGroup(g);
                    }
                }
            } finally {
                // Always release the underlying file handle.
                r.close();
            }
        } catch (IOException e) {
            System.out.println("Error reading parquet file.");
            e.printStackTrace();
        }
    }
}
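For context on what reading the footer involves: in the Parquet file format, a file begins and ends with the 4-byte magic `PAR1`, and the 4 bytes just before the trailing magic store the length of the footer metadata as a little-endian integer. The sketch below uses only the Java standard library to check those bytes. It is an illustration of the file layout, not how `ParquetFileReader` is implemented; the class name `ParquetFooterPeek` and its helper methods are hypothetical, and it assumes a file with an unencrypted footer.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ParquetFooterPeek {

    // Magic bytes found at both the start and the end of a Parquet file.
    private static final byte[] MAGIC = "PAR1".getBytes(StandardCharsets.US_ASCII);

    /** head: first 4 bytes of the file; tail: last 8 bytes (footer length + magic). */
    static boolean hasParquetMagic(byte[] head, byte[] tail) {
        return head.length >= 4 && tail.length >= 8
                && Arrays.equals(Arrays.copyOfRange(head, 0, 4), MAGIC)
                && Arrays.equals(Arrays.copyOfRange(tail, tail.length - 4, tail.length), MAGIC);
    }

    /** Decodes the 4-byte little-endian footer length that precedes the trailing magic. */
    static int footerLength(byte[] tail) {
        return ByteBuffer.wrap(tail, tail.length - 8, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("Usage: java ParquetFooterPeek <path-to-file>");
            return;
        }
        try (RandomAccessFile file = new RandomAccessFile(args[0], "r")) {
            // Leading magic + footer length + trailing magic need at least 12 bytes.
            if (file.length() < 12) {
                System.out.println("File is too small to be a Parquet file.");
                return;
            }
            byte[] head = new byte[4];
            byte[] tail = new byte[8];
            file.readFully(head);
            file.seek(file.length() - 8);
            file.readFully(tail);
            if (!hasParquetMagic(head, tail)) {
                System.out.println("Not a Parquet file (magic bytes missing).");
                return;
            }
            System.out.println("Footer metadata length in bytes: " + footerLength(tail));
        }
    }
}
```

A check like this can help tell a corrupt or mislabelled file apart from a configuration problem when the reader above fails with an IOException.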
