R005 Data Input in R

Author: caoqiansheng | Source: published 2020-08-11 00:01

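R can import data from the keyboard, text files, Microsoft Excel and Access, popular statistical packages, specially formatted files, a variety of relational database management systems, specialized databases, and websites and online services.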


1. Entering data from the keyboard

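Perhaps the simplest way to enter data is from the keyboard. There are two common approaches: using R's built-in text editor, and embedding the data directly in your code.

  • Use R's built-in text editor. The function edit() opens a text editor that lets you enter data by hand. You call it on an existing object, such as an empty data frame, and assign the result back to that object. The figure shows the result of calling edit() on Windows. Click a column heading to change the variable's name and type (numeric or character); click the heading of an unused column to add a new variable. When the editor is closed, the result is saved to the object it was assigned to (mydata in this example). Calling mydata <- edit(mydata) again lets you edit the data already entered and add new data. A shorthand equivalent of mydata <- edit(mydata) is fix(mydata).
    [Figure: the edit() data editor on Windows]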
  • Embed the dataset directly in your program.
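
The second approach can be sketched roughly as follows. This is a minimal illustration, not the author's original listing: the column names and values are made up, and the data is passed to read.table() through its text argument. The edit() call is guarded so that it only runs in an interactive session.

```r
# Embed a small dataset as a string (illustrative values, not from the source).
mydatatxt <- "age gender weight
25 m 166
30 f 115
18 f 120"

# Parse the embedded text into a data frame.
# stringsAsFactors is set explicitly so the result does not depend on the R version.
mydata <- read.table(text = mydatatxt, header = TRUE, stringsAsFactors = FALSE)
str(mydata)

# Open the data editor on the result, only when a user is present.
if (interactive()) {
  mydata <- edit(mydata)
}
```

Embedding the data this way keeps a small example fully reproducible, since the script carries its own input.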

2. Importing data from a delimited text file

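You can import data from a delimited text file with read.table(). This function reads a file in table format and saves it as a data frame. Each row of the table appears as one line of the file. The syntax is as follows: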

read.table(file, header = FALSE, sep = "", quote = "\"'",
           dec = ".", numerals = c("allow.loss", "warn.loss", "no.loss"),
           row.names, col.names, as.is = !stringsAsFactors,
           na.strings = "NA", colClasses = NA, nrows = -1,
           skip = 0, check.names = TRUE, fill = !blank.lines.skip,
           strip.white = FALSE, blank.lines.skip = TRUE,
           comment.char = "#",
           allowEscapes = FALSE, flush = FALSE,
           stringsAsFactors = default.stringsAsFactors(),
           fileEncoding = "", encoding = "unknown", text, skipNul = FALSE)

read.csv(file, header = TRUE, sep = ",", quote = "\"",
         dec = ".", fill = TRUE, comment.char = "", ...)

read.csv2(file, header = TRUE, sep = ";", quote = "\"",
          dec = ",", fill = TRUE, comment.char = "", ...)

read.delim(file, header = TRUE, sep = "\t", quote = "\"",
           dec = ".", fill = TRUE, comment.char = "", ...)

read.delim2(file, header = TRUE, sep = "\t", quote = "\"",
            dec = ",", fill = TRUE, comment.char = "", ...)

Options of the read.table() function
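
As a rough illustration of a few of these options, the sketch below writes a small comma-delimited file to a temporary location and reads it back. The file contents and the names csv_path, grades and grades_default are invented for the example.

```r
# Write a small comma-delimited file to a temporary location (made-up data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "StudentID,First,Last,Math",
  "011,Bob,Smith,90",
  "012,Jane,Weary,NA",
  "010,Dan,Thornton,65"
), csv_path)

# header = TRUE: the first line holds the variable names.
# sep = ",": fields are separated by commas.
# row.names: use the StudentID column as row identifiers.
# colClasses: one class per column, including the row-name column,
#             so the leading zeros in StudentID are preserved.
grades <- read.table(csv_path, header = TRUE, sep = ",",
                     row.names = "StudentID",
                     colClasses = c("character", "character",
                                    "character", "numeric"))
str(grades)

# read.csv() already defaults to header = TRUE and sep = ",".
# Without colClasses, StudentID is converted to an integer and "011" becomes 11.
grades_default <- read.csv(csv_path, stringsAsFactors = FALSE)
str(grades_default)
```

The string "NA" in the Math column is read as a missing value because it matches the default na.strings = "NA".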

3. Importing Excel data

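  • Often the best way to read an Excel file is to export it from Excel as a comma-separated file (csv) and import it into R as described above. Alternatively, you can import Excel worksheets directly with the xlsx package.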

  • The readxl package

install.packages("readxl")
library(readxl)
Usage

read_excel(path, sheet = NULL, range = NULL, col_names = TRUE,
  col_types = NULL, na = "", trim_ws = TRUE, skip = 0,
  n_max = Inf, guess_max = min(1000, n_max),
  progress = readxl_progress(), .name_repair = "unique")

read_xls(path, sheet = NULL, range = NULL, col_names = TRUE,
  col_types = NULL, na = "", trim_ws = TRUE, skip = 0,
  n_max = Inf, guess_max = min(1000, n_max),
  progress = readxl_progress(), .name_repair = "unique")

read_xlsx(path, sheet = NULL, range = NULL, col_names = TRUE,
  col_types = NULL, na = "", trim_ws = TRUE, skip = 0,
  n_max = Inf, guess_max = min(1000, n_max),
  progress = readxl_progress(), .name_repair = "unique")
Arguments

path    
Path to the xls/xlsx file.
sheet   
Sheet to read. Either a string (the name of a sheet), or an integer (the position of the sheet). Ignored if the sheet is specified via range. If neither argument specifies the sheet, defaults to the first sheet.
range   
A cell range to read from, as described in cell-specification. Includes typical Excel ranges like "B3:D87", possibly including the sheet name like "Budget!B2:G14", and more. Interpreted strictly, even if the range forces the inclusion of leading or trailing empty rows or columns. Takes precedence over skip, n_max and sheet.
col_names   
TRUE to use the first row as column names, FALSE to get default names, or a character vector giving a name for each column. If user provides col_types as a vector, col_names can have one entry per column, i.e. have the same length as col_types, or one entry per unskipped column.
col_types   
Either NULL to guess all from the spreadsheet or a character vector containing one entry per column from these options: "skip", "guess", "logical", "numeric", "date", "text" or "list". If exactly one col_type is specified, it will be recycled. The content of a cell in a skipped column is never read and that column will not appear in the data frame output. A list cell loads a column as a list of length 1 vectors, which are typed using the type guessing logic from col_types = NULL, but on a cell-by-cell basis.
na  
Character vector of strings to interpret as missing values. By default, readxl treats blank cells as missing data.
trim_ws 
Should leading and trailing whitespace be trimmed?
skip    
Minimum number of rows to skip before reading anything, be it column names or data. Leading empty rows are automatically skipped, so this is a lower bound. Ignored if range is given.
n_max   
Maximum number of data rows to read. Trailing empty rows are automatically skipped, so this is an upper bound on the number of rows in the returned tibble. Ignored if range is given.
guess_max   
Maximum number of data rows to use for guessing column types.
progress    
Display a progress spinner? By default, the spinner appears only in an interactive session, outside the context of knitting a document, and when the call is likely to run for several seconds or more. See readxl_progress() for more details.
.name_repair    
Handling of column names. By default, readxl ensures column names are not empty and are unique. If the tibble package version is recent enough, there is full support for .name_repair as documented in tibble::tibble(). If an older version of tibble is present, readxl falls back to name repair in the style of tibble v1.4.2.
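
A short sketch of how some of these arguments might be used is given below. It assumes the readxl package is installed and relies on the example workbook datasets.xlsx that ships with readxl; the variable names are illustrative only.

```r
# Run only if readxl is available, so the script does not fail without it.
if (requireNamespace("readxl", quietly = TRUE)) {
  # Path to an example workbook bundled with readxl.
  xlsx_path <- readxl::readxl_example("datasets.xlsx")

  # List the sheet names in the workbook.
  sheets <- readxl::excel_sheets(xlsx_path)
  print(sheets)

  # Default: read the first sheet; n_max limits the number of data rows.
  iris_head <- readxl::read_excel(xlsx_path, n_max = 5)

  # Select a sheet by name and read only a rectangular cell range.
  # The first row of the range is used for column names, leaving 4 data rows.
  cars_block <- readxl::read_excel(xlsx_path, sheet = "mtcars", range = "A1:C5")
  print(cars_block)
}
```

Both calls return a tibble; wrap the result in as.data.frame() if a plain data frame is needed.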

4. Importing SPSS data

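IBM SPSS datasets can be imported into R with the function read.spss() in the foreign package, or with the function spss.get() in the Hmisc package. spss.get() is a wrapper around read.spss() that sets many of its parameters for you, making the conversion simpler and more consistent and producing the result a data analyst would expect.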
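
The two routes might look like the sketch below. It uses electric.sav, an example file distributed with the foreign package, in place of a real dataset; the Hmisc call is guarded because that package may not be installed.

```r
library(foreign)

# Example SPSS file bundled with the foreign package.
sav_path <- system.file("files", "electric.sav", package = "foreign")

# to.data.frame = TRUE returns a data frame instead of a list;
# use.value.labels = TRUE turns labelled SPSS variables into factors.
spss_data <- read.spss(sav_path, use.value.labels = TRUE, to.data.frame = TRUE)
str(spss_data)

# The Hmisc wrapper, if available, returns a data frame by default.
if (requireNamespace("Hmisc", quietly = TRUE)) {
  spss_data2 <- Hmisc::spss.get(sav_path, use.value.labels = TRUE)
  str(spss_data2)
}
```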

5. Importing SAS data

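R has several functions for importing SAS datasets, including read.ssd() in the foreign package, sas.get() in the Hmisc package, and read.sas7bdat() in the sas7bdat package. If you have SAS installed, sas.get() is a good choice.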
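
For a machine without SAS, reading a .sas7bdat file directly could be sketched as follows. The file name mydata.sas7bdat is hypothetical, and the code assumes the sas7bdat package is installed; it does nothing if either is missing.

```r
# Hypothetical path to a SAS dataset; replace with a real file.
sas_file <- "mydata.sas7bdat"

if (requireNamespace("sas7bdat", quietly = TRUE) && file.exists(sas_file)) {
  # read.sas7bdat() parses the file directly; no SAS installation is needed.
  sas_data <- sas7bdat::read.sas7bdat(sas_file)
  str(sas_data)
} else {
  message("sas7bdat package or file not available; skipping the import.")
}
```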

6. Importing data with RStudio

[Figure: importing data in RStudio]


Article link: https://www.haomeiwen.com/subject/nousdktx.html