How to convert a PDF file to text in Android Studio

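I want to pick a PDF file from the file manager on Android and convert it to text so that text-to-speech can read it aloud. I am following this documentation from the Android developers site; however, its example opens a plain text file. I am using the PdfReader class/library to open the file and convert it to text, but I don't know how to integrate it with a Uri. This is the code I need for converting a PDF to text with PdfReader: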

PdfReader pdfReader = new PdfReader(file.getPath());
stringParser = PdfTextExtractor.getTextFromPage(pdfReader, 1).trim();
pdfReader.close();

I use an intent to launch the file manager so that the user can pick a PDF file:

fab.setOnClickListener(new View.OnClickListener() {
    @Override
    public void onClick(View view) {
        intent = new Intent(Intent.ACTION_OPEN_DOCUMENT);
        intent.setType("*/*");
        startActivityForResult(intent, READ_REQUEST_CODE);
    }
});

Then I want to get the Uri and open the file:

@Override
protected void onActivityResult(int requestCode, int resultCode, Intent resultData) {
    if (requestCode == READ_REQUEST_CODE && resultCode == Activity.RESULT_OK) {
        if (resultData != null) {
            Uri uri = resultData.getData();
            Toast.makeText(MainActivity.this, filePath, Toast.LENGTH_LONG).show();
            readPdfFile(uri);
        }
    }
}

private String readTextFromUri(Uri uri) throws IOException {
    StringBuilder stringBuilder = new StringBuilder();
    try (InputStream inputStream = getContentResolver().openInputStream(uri);
         BufferedReader reader = new BufferedReader(
                 new InputStreamReader(Objects.requireNonNull(inputStream)))) {
        String line;
        while ((line = reader.readLine()) != null) {
            stringBuilder.append(line);
        }
    }
    return stringBuilder.toString();
}
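Because the picker above accepts any MIME type ("*/*"), the selected file may not be a PDF at all. One possible guard, sketched here with plain Java I/O, is to check that the stream begins with the bytes "%PDF-", which is how well-formed PDF files start. The class and method names are hypothetical and are not part of the question's code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original question.
final class PdfHeaderCheck {
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the stream starts with "%PDF-". Consumes the bytes it inspects. */
    static boolean startsWithPdfHeader(InputStream in) throws IOException {
        for (byte expected : PDF_MAGIC) {
            // read() returns -1 at end of stream, which never matches an ASCII byte
            if (in.read() != expected) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream returned by getContentResolver().openInputStream(uri)
        InputStream in = new ByteArrayInputStream("%PDF-1.7\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(startsWithPdfHeader(in));
    }
}
```

Since the check consumes the first bytes, the stream would need to be reopened from the Uri before the document is actually parsed.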

Answer
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class SyncPdfTextExtractor {
    // TODO: When you have your own Premium account credentials, put them down here:
    private static final String CLIENT_ID = "FREE_TRIAL_ACCOUNT";
    private static final String CLIENT_SECRET = "PUBLIC_SECRET";
    private static final String ENDPOINT = "https://api.whatsmate.net/v1/pdf/extract?url=";

    /**
     * Entry Point
     */
    public static void main(String[] args) throws Exception {
        // TODO: Specify the URL of your small PDF document (less than 1MB and 10 pages)
        // To extract text from a bigger PDF document, you need to use the async method.
        String url = "https://www.harvesthousepublishers.com/data/files/excerpts/9780736948487_exc.pdf";
        SyncPdfTextExtractor.extractText(url);
    }

    /**
     * Extracts the text from an online PDF document.
     */
    public static void extractText(String pdfUrl) throws Exception {
        URL url = new URL(ENDPOINT + pdfUrl);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("X-WM-CLIENT-ID", CLIENT_ID);
        conn.setRequestProperty("X-WM-CLIENT-SECRET", CLIENT_SECRET);

        int statusCode = conn.getResponseCode();
        System.out.println("Status Code: " + statusCode);

        InputStream is = null;
        if (statusCode == 200) {
            is = conn.getInputStream();
            System.out.println("PDF text is shown below");
            System.out.println("=======================");
        } else {
            is = conn.getErrorStream();
            System.err.println("Something is wrong:");
        }

        // Print the response body (extracted text or error message) line by line
        BufferedReader br = new BufferedReader(new InputStreamReader(is));
        String output;
        while ((output = br.readLine()) != null) {
            System.out.println(output);
        }
        conn.disconnect();
    }
}

After copying the code above, follow these steps: set the url variable in main to the address of your online PDF document, and replace CLIENT_ID and CLIENT_SECRET if you have your own credentials.
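The answer above appends the PDF address directly after "?url=". If that address itself contains characters such as "&", "?" or spaces, the request URL becomes ambiguous. A minimal sketch of percent-encoding the value with the standard URLEncoder follows; it assumes the service decodes the query value in the usual way, which the answer does not confirm. The class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of the original answer.
final class ExtractRequestUrl {
    // Same endpoint string as in the answer above.
    private static final String ENDPOINT = "https://api.whatsmate.net/v1/pdf/extract?url=";

    /** Builds the request URL with the PDF address encoded as a single query value. */
    static String build(String pdfUrl) throws UnsupportedEncodingException {
        // Assumption: the service URL-decodes the "url" parameter before fetching it.
        return ENDPOINT + URLEncoder.encode(pdfUrl, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(build("https://example.com/my file.pdf?version=2&lang=en"));
    }
}
```

The result of build could then be passed to new URL(...) in place of ENDPOINT + pdfUrl.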

Another answer
Use this Gradle dependency:
implementation 'com.itextpdf:itextg:5.5.10'

try {
    String parsedText = "";
    PdfReader reader = new PdfReader(yourPdfPath);
    int n = reader.getNumberOfPages();
    for (int i = 0; i < n; i++) {
        // Extracting the content from the different pages (page numbers start at 1)
        parsedText = parsedText + PdfTextExtractor.getTextFromPage(reader, i + 1).trim() + " ";
    }
    System.out.println(parsedText);
    reader.close();
} catch (Exception e) {
    System.out.println(e);
}
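The snippet above expects a file path (yourPdfPath), while the question starts from a content Uri, which only yields an InputStream. One way to bridge the two, sketched below with plain Java I/O, is to copy the stream into a file the app owns and pass that file's path to PdfReader. The class and method names are hypothetical. On Android, the stream would presumably come from getContentResolver().openInputStream(uri), as in the question, and the target directory could be getCacheDir().

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of the original answer.
final class StreamToFile {
    /** Copies everything from in into a file called name inside dir and returns that file. */
    static File copyToFile(InputStream in, File dir, String name) throws IOException {
        File target = new File(dir, name);
        try (OutputStream out = new FileOutputStream(target)) {
            byte[] buffer = new byte[8192];
            int read;
            // read() returns -1 once the stream is exhausted
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Stand-ins: on Android these would be the content resolver stream and the cache directory.
        InputStream in = new ByteArrayInputStream(new byte[] {'%', 'P', 'D', 'F', '-'});
        File dir = new File(System.getProperty("java.io.tmpdir"));
        File copy = copyToFile(in, dir, "picked.pdf");
        System.out.println(copy.getPath() + " (" + copy.length() + " bytes)");
    }
}
```

The returned file's getPath() can then stand in for yourPdfPath. iText 5's PdfReader also appears to offer a constructor that takes an InputStream, which would avoid the copy; that is worth checking against the library version in use.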

