src/main/asciidoc/_chapters/configuration.adoc (15 additions, 11 deletions)
@@ -29,7 +29,7 @@
 This chapter expands upon the <<getting_started>> chapter to further explain configuration of Apache HBase.
 Please read this chapter carefully, especially the <<basic.prerequisites,Basic Prerequisites>>
-to ensure that your HBase testing and deployment goes smoothly, and prevent data loss.
+to ensure that your HBase testing and deployment goes smoothly.
 Familiarize yourself with <<hbase_supported_tested_definitions>> as well.

 == Configuration Files
@@ -164,9 +164,9 @@ It is recommended to raise the ulimit to at least 10,000, but more likely 10,240

 For example, assuming that a schema had 3 ColumnFamilies per region with an average of 3 StoreFiles per ColumnFamily, and there are 100 regions per RegionServer, the JVM will open `3 * 3 * 100 = 900` file descriptors, not counting open JAR files, configuration files, and others. Opening a file does not take many resources, and the risk of allowing a user to open too many files is minimal.

-Another related setting is the number of processes a user is allowed to run at once. In Linux and Unix, the number of processes is set using the `ulimit -u` command. This should not be confused with the `nproc` command, which controls the number of CPUs available to a given user. Under load, a `ulimit -u` that is too low can cause OutOfMemoryError exceptions. See Jack Levin's major HDFS issues thread on the hbase-users mailing list, from 2011.
+Another related setting is the number of processes a user is allowed to run at once. In Linux and Unix, the number of processes is set using the `ulimit -u` command. This should not be confused with the `nproc` command, which controls the number of CPUs available to a given user. Under load, a `ulimit -u` that is too low can cause OutOfMemoryError exceptions.

-Configuring the maximum number of file descriptors and processes for the user who is running the HBase process is an operating system configuration, rather than an HBase configuration. It is also important to be sure that the settings are changed for the user that actually runs HBase. To see which user started HBase, and that user's ulimit configuration, look at the first line of the HBase log for that instance. A useful read setting config on your hadoop cluster is Aaron Kimball's Configuration Parameters: What can you just ignore?
+Configuring the maximum number of file descriptors and processes for the user who is running the HBase process is an operating system configuration, rather than an HBase configuration. It is also important to be sure that the settings are changed for the user that actually runs HBase. To see which user started HBase, and that user's ulimit configuration, look at the first line of the HBase log for that instance.

 .`ulimit` Settings on Ubuntu
 ====
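As a quick sketch of the file-descriptor arithmetic and the two `ulimit` checks this hunk discusses (the 3/3/100 figures come straight from the text; everything else here is illustrative, not part of the patch):

```shell
# Estimate store file descriptors per RegionServer, using the figures
# from the text: 3 ColumnFamilies x 3 StoreFiles x 100 regions.
families=3
storefiles=3
regions=100
fds=$((families * storefiles * regions))
echo "store file descriptors: ${fds}"   # 900, before JARs, config files, etc.

# Inspect the limits for the current user/shell.
ulimit -n   # max open file descriptors (raise to at least 10,000)
ulimit -u   # max user processes (distinct from the nproc command)
```

Remember that these limits must be raised for the user that actually runs the HBase process, via operating system configuration, not HBase configuration.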
@@ -201,7 +201,8 @@ See link:https://wiki.apache.org/hadoop/Distributions%20and%20Commercial%20Suppo
 .Hadoop 2.x is recommended.
 [TIP]
 ====
-Hadoop 2.x is faster and includes features, such as short-circuit reads, which will help improve your HBase random read profile.
+Hadoop 2.x is faster and includes features, such as short-circuit reads (see <<perf.hdfs.configs.localread>>),
+which will help improve your HBase random read profile.
 Hadoop 2.x also includes important bug fixes that will improve your overall HBase experience. HBase does not support running with
 earlier versions of Hadoop. See the table below for requirements specific to different HBase versions.
@@ -226,7 +227,8 @@ Use the following legend to interpret this table:
 |Hadoop-2.7.0 | X | X | X
 |Hadoop-2.7.1+ | S | S | S
 |Hadoop-2.8.[0-1] | X | X | X
-|Hadoop-2.8.2+ | NT | NT | NT
+|Hadoop-2.8.2 | NT | NT | NT
+|Hadoop-2.8.3+ | NT | NT | S
 |Hadoop-2.9.0 | X | X | X
 |Hadoop-3.0.0 | X | X | NT
 |===
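Before consulting the support matrix above, you need the Hadoop version actually deployed on the cluster. One hedged way to extract the comparable `major.minor` prefix from `hadoop version` output (the sample line below is a stand-in, since output formatting can vary by distribution):

```shell
# Stand-in for the first line of: hadoop version | head -n 1
version_line="Hadoop 2.8.3"

# Extract the major.minor prefix to match against rows like Hadoop-2.8.3+.
minor=$(echo "$version_line" | sed 's/^Hadoop \([0-9]*\.[0-9]*\).*/\1/')
echo "$minor"   # 2.8
```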
@@ -252,18 +254,20 @@ data loss. This patch is present in Apache Hadoop releases 2.6.1+.
 .Hadoop 2.y.0 Releases
 [TIP]
 ====
-Starting around the time of Hadoop version 2.7.0, the Hadoop PMC got into the habit of calling out new minor releases on their major version 2 release line as not stable / production ready. As such, HBase expressly advises downstream users to avoid running on top of these releases. Note that additionally the 2.8.1 was release was given the same caveat by the Hadoop PMC. For reference, see the release announcements for link:https://s.apache.org/hadoop-2.7.0-announcement[Apache Hadoop 2.7.0], link:https://s.apache.org/hadoop-2.8.0-announcement[Apache Hadoop 2.8.0], link:https://s.apache.org/hadoop-2.8.1-announcement[Apache Hadoop 2.8.1], and link:https://s.apache.org/hadoop-2.9.0-announcement[Apache Hadoop 2.9.0].
+Starting around the time of Hadoop version 2.7.0, the Hadoop PMC got into the habit of calling out new minor releases on their major version 2 release line as not stable / production ready. As such, HBase expressly advises downstream users to avoid running on top of these releases. Note that additionally the 2.8.1 release was given the same caveat by the Hadoop PMC. For reference, see the release announcements for link:https://s.apache.org/hadoop-2.7.0-announcement[Apache Hadoop 2.7.0], link:https://s.apache.org/hadoop-2.8.0-announcement[Apache Hadoop 2.8.0], link:https://s.apache.org/hadoop-2.8.1-announcement[Apache Hadoop 2.8.1], and link:https://s.apache.org/hadoop-2.9.0-announcement[Apache Hadoop 2.9.0].
 ====

 .Replace the Hadoop Bundled With HBase!
 [NOTE]
 ====
-Because HBase depends on Hadoop, it bundles an instance of the Hadoop jar under its _lib_ directory.
-The bundled jar is ONLY for use in standalone mode.
+Because HBase depends on Hadoop, it bundles Hadoop jars under its _lib_ directory.
+The bundled jars are ONLY for use in standalone mode.
 In distributed mode, it is _critical_ that the version of Hadoop that is out on your cluster match what is under HBase.
-Replace the hadoop jar found in the HBase lib directory with the hadoop jar you are running on your cluster to avoid version mismatch issues.
-Make sure you replace the jar in HBase across your whole cluster.
-Hadoop version mismatch issues have various manifestations but often all look like its hung.
+Replace the hadoop jars found in the HBase lib directory with the equivalent hadoop jars from the version you are running
+on your cluster to avoid version mismatch issues.
+Make sure you replace the jars under HBase across your whole cluster.
+Hadoop version mismatch issues have various manifestations. Check for mismatch if
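The jar-replacement step the NOTE describes can be sketched as below. This demo runs against throwaway directories so it is safe to execute; a real install would operate on the actual HBase `lib/` directory and the cluster's Hadoop distribution, and all paths and version numbers here are stand-ins, not values from the patch:

```shell
# Hedged sketch: swap the Hadoop jars bundled with HBase for the ones
# matching the cluster's Hadoop version. Demonstrated on temp directories.
demo=$(mktemp -d)
mkdir -p "$demo/hbase/lib" "$demo/hadoop"
touch "$demo/hbase/lib/hadoop-common-2.5.1.jar"   # stand-in: bundled jar
touch "$demo/hadoop/hadoop-common-2.7.4.jar"      # stand-in: cluster's jar

# 1. Remove the hadoop jars shipped under HBase's lib directory.
rm -f "$demo"/hbase/lib/hadoop-*.jar

# 2. Copy in the equivalent jars from the Hadoop version running on the cluster.
cp "$demo"/hadoop/hadoop-*.jar "$demo/hbase/lib/"

ls "$demo/hbase/lib"   # now lists only the cluster-version jar
```

Repeat the same replacement on every node: a single node left with the bundled jars reintroduces the version-mismatch problem the NOTE warns about.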