Dissecting Hadoop's Cluster Management and Security Mechanisms
Published: 2022-05-13 09:10:47 | Category: Security
HDFS Data Management

1. Set the storage paths for metadata and data via configuration properties: dfs.name.dir, dfs.data.dir, and fs.checkpoint.dir in Hadoop 1.x; hadoop.tmp.dir, dfs.namenode.name.dir, dfs.namenode.edits.dir, and dfs.datanode.data.dir in Hadoop 2.x.

2. Run the HDFS file system checker FSCK regularly, e.g. hdfs fsck /liguodong -files -blocks:

```
[root@slave1 mapreduce]# hdfs fsck /input
Connecting to namenode via http://slave1:50070
FSCK started by root (auth:SIMPLE) from /172.23.253.22 for path /input at Tue Jun 16 21:29:21 CST 2015
.Status: HEALTHY
 Total size:    80 B
 Total dirs:    0
 Total files:   1
 Total symlinks:                0
 Total blocks (validated):      1 (avg. block size 80 B)
 Minimally replicated blocks:   1 (100.0 %)
```

3. If data becomes abnormal, the NameNode can be put into safe mode, in which it serves read-only requests.
Command: hdfs dfsadmin -safemode enter | leave | get | wait

```
[root@slave1 mapreduce]# hdfs dfsadmin -report
Configured Capacity: 52844687360 (49.22 GB)
Present Capacity: 45767090176 (42.62 GB)
DFS Remaining: 45766246400 (42.62 GB)
DFS Used: 843776 (824 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
```

4. Each DataNode runs a block-scanner thread that detects bad or missing blocks so they can be repaired. The scan period is set via the property dfs.datanode.scan.period.hours; the default is 504 hours (three weeks).

MapReduce Job Management

List jobs: mapred job -list
Kill a job: mapred job -kill <job-id>

```
[root@slave1 mapreduce]# mapred job
Usage: CLI <command> <args>
        [-submit <job-file>]
        [-status <job-id>]
        [-counter <job-id> <group-name> <counter-name>]
        [-kill <job-id>]
        [-set-priority <job-id> <priority>]. Valid values for priorities are: VERY_HIGH HIGH NORMAL LOW VERY_LOW
        [-events <job-id> <from-event-#> <#-of-events>]
        [-history <jobHistoryFile>]
        [-list [all]]
        [-list-active-trackers]
        [-list-blacklisted-trackers]
        [-list-attempt-ids <job-id> <task-type> <task-state>]. Valid values for <task-type> are REDUCE MAP. Valid values for <task-state> are running, completed
        [-kill-task <task-attempt-id>]
        [-fail-task <task-attempt-id>]
        [-logs <job-id> <task-attempt-id>]

Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|jobtracker:port>    specify a job tracker
-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.
```
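As a sketch of how the fsck summary might be consumed in a health-check script: fsck prints a "Status: HEALTHY" (or "Status: CORRUPT") line, which can be grepped out. The captured sample text below stands in for a live `hdfs fsck /input` call, and the grep/awk filter is an illustrative assumption, not part of Hadoop itself.

```shell
# Sketch: check whether `hdfs fsck <path>` reported a HEALTHY filesystem.
# The sample below is the fsck output captured earlier; on a live cluster you
# would run:  hdfs fsck /input | grep -o 'Status: [A-Z]*' ...
fsck_out='FSCK started by root (auth:SIMPLE) from /172.23.253.22 for path /input at Tue Jun 16 21:29:21 CST 2015
.Status: HEALTHY
 Total size: 80 B
 Total files: 1
 Total blocks (validated): 1 (avg. block size 80 B)'

# Pull out the first "Status: <WORD>" occurrence and keep only the word.
status=$(printf '%s\n' "$fsck_out" | grep -o 'Status: [A-Z]*' | head -n1 | awk '{print $2}')
if [ "$status" = "HEALTHY" ]; then
    echo "fsck: filesystem healthy"
else
    echo "fsck: filesystem reported $status" >&2
fi
```

A cron job wrapping this check can page an operator before missing blocks accumulate.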
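Similarly, the `hdfs dfsadmin -report` output shown above is line-oriented and easy to parse for monitoring. In this sketch the `report` variable holds the captured sample so the snippet is self-contained; the awk filter is an illustrative assumption.

```shell
# Sketch: extract the "DFS Used%" field from `hdfs dfsadmin -report` output.
# On a live cluster, replace the sample with:  hdfs dfsadmin -report | awk ...
report='Configured Capacity: 52844687360 (49.22 GB)
Present Capacity: 45767090176 (42.62 GB)
DFS Remaining: 45766246400 (42.62 GB)
DFS Used: 843776 (824 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0'

# Split each line on ": " and print the value of the "DFS Used%" line.
used_pct=$(printf '%s\n' "$report" | awk -F': ' '/^DFS Used%/ {print $2}')
echo "DFS used: $used_pct"
```

The same pattern extracts "Missing blocks" or "Blocks with corrupt replicas" for alerting thresholds.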
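Finally, `mapred job -list` and `mapred job -kill` can be combined to clean up jobs in bulk. The sample listing below is an illustrative assumption (the exact column layout varies across Hadoop versions), so the sketch matches only the `job_*` identifiers rather than relying on column positions.

```shell
# Sketch: collect job IDs from `mapred job -list` output and kill each one.
# The listing text is a hypothetical stand-in for:  mapred job -list
listing='Total jobs:2
            JobId       State   StartTime   UserName
 job_1434998983171_0001 RUNNING 1434998983171 root
 job_1434998983171_0002 PREP    1434998983190 root'

for job_id in $(printf '%s\n' "$listing" | grep -o 'job_[0-9]*_[0-9]*'); do
    # On a live cluster:  mapred job -kill "$job_id"
    echo "would kill $job_id"
done
```

Matching on the `job_<cluster-ts>_<seq>` identifier pattern keeps the script robust even if headers or column widths change between releases.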